Abstract. Nested words, a model for recursive programs proposed by Alur and
Madhusudan, have recently gained much interest. In this paper we introduce quantitative
extensions and study nested word series which assign to nested words elements of a
semiring. We show that regular nested word series coincide with series definable in weighted
logics as introduced by Droste and Gastin. For this we establish a connection between
nested words and the free bisemigroup. Applying our result, we obtain characterizations
of algebraic formal power series in terms of weighted logics. This generalizes results of
Lautemann, Schwentick and Thérien on context-free languages.



Model checking of finite state systems has become an established method for automatic 
hardware and software verification and led to numerous verification programs used in
industrial applications. In order to verify recursive programs it is necessary to model them
as pushdown systems rather than finite automata. This has motivated Alur and Madhusudan
[3,4] to define regular nested word languages and visibly pushdown languages. The
latter is a proper subclass of the context-free languages and exceeds the regular languages. 
Both classes are closely related. Nested words on the one hand have a linear sequential 
structure and on the other hand have a hierarchical structure. This way they may also 
be used to model linguistic data as well as semistructured data such as XML documents. 
Nested words and visibly pushdown languages gained much interest and set a starting point 
for a new research field (see e.g. [H0[7] among many others). 

The goal of this paper is threefold: 1. to introduce a quantitative automaton model and a
quantitative logic for nested words that are equally expressive, 2. to establish a connection
between nested words and alternating texts, a graph representation of the free bisemigroup
which is an object studied by Ésik and Németh [17] and Hashiguchi et al. [19-21], 3. to
give a characterization of the important class of algebraic formal power series by means of
weighted logics.

In order to model quantitative aspects, extensions of existing models such as weighted
automata were investigated. There, transitions of automata additionally carry a weight
which can be of very different nature (e.g. counting, probabilities, etc.). In fact, weighted
automata have found many different applications, e.g. in image processing [10], in speech
recognition [33] or as a model for probabilistic systems [5,6]. In this paper we introduce
and investigate weighted nested word automata which may serve as a quantitative model 
for sequential programs with recursive procedure calls. Due to the fact that we define them 
over arbitrary semirings, they are very flexible and can model, for example, probabilistic or 
stochastic programs of recursive nature as well as quantitative database queries. 

Since weighted nested word automata and weighted pushdown automata are closely
related, one should also mention that weighted pushdown systems have been applied to
data flow analysis (see e.g. [23,24]). There, however, the emphasis lies on the (weighted)
configuration graph of the system which is used to model the state space of a program.
Weights are incorporated in order to model, for example, the data of the program. In [23,24]
weighted versions of reachability problems in such graphs were considered.

In this paper we are interested in the semantics of a weighted automaton given as a 
mapping which assigns a value to each nested word. As the first main result of this paper 
we characterize the expressiveness of weighted nested word automata using weighted logics, 
generalizing a result of Alur and Madhusudan. Weighted logics were introduced by Droste
and Gastin. They enriched the classical language of monadic second-order logic with
values from a semiring in order to add quantitative expressiveness. This way one may now
e.g. express how often a certain property holds, how much execution time a process needs
or how reliable it is. The result of Droste and Gastin has been extended to infinite words,
(infinite) trees, texts, pictures and traces [14,15,18,28,33,36]. We note, moreover, that a
restriction of Łukasiewicz multi-valued logic coincides with these weighted logics [38].

In order to prove our result mentioned above we establish a new connection between
alternating texts and nested words and reduce the result to an analogous one for alternating
texts. The class of alternating texts, introduced by Ehrenfeucht and Rozenberg [16],
forms the free bisemigroup which was also investigated by Hashiguchi et al. [19-21]. Moreover,
a language theory for series-parallel-biposets, a different representation of the free
bisemigroup, was developed by Ésik and Németh [17]. Besides the author's opinion that a
reduction to a previously known result is mathematically more elegant than e.g. a structural
induction, the approach has the advantage that it gives insight into relationships
and similarities between different structures considered in the literature.
For example, decidability results for the emptiness and equivalence problem come
almost for free as a corollary. Note that this extends the classical satisfiability problem for
monadic second-order logic, which is one motivation for transforming formulas into automata.

Furthermore, we can use the connection again in this paper to obtain a new
characterization of algebraic formal power series. The latter form an important generalization
of context-free languages. Algebraic formal power series were considered initially already
by Chomsky and Schützenberger [8] and have since been intensively studied by Kuich and
others. For a survey see [25] or [26]. Using projections of nested word series and applying
the logical characterization of weighted nested word automata, we are able to give a
characterization of algebraic formal power series in terms of weighted logics, generalizing a
result of Lautemann, Schwentick and Thérien [27] on context-free languages. The connection
between alternating texts and nested words is then used to also generalize a second
characterization of [27], thereby giving a different proof also for the result of Lautemann,
Schwentick and Thérien.

The paper is organized as follows. In Section 2 we introduce nested words and weighted
automata for nested words and give an example for them. In Section 3 we introduce weighted
logics for nested words, introduce different fragments of the latter and state the first main
result, the characterization of regular nested word series in terms of weighted logics. In
Section 4 we introduce alternating texts, a graph representation of the free bisemigroup,
and define a weighted version of Ésik and Németh's parenthesizing automata operating
over elements of the free bisemigroup. Next, in Section 5 we define an embedding of nested
words into alternating texts and show that we can translate weighted formulae as well as
automata back and forth with respect to this embedding. This gives the proof of the first
main result. After that, in Section 6 we apply the result and obtain characterizations of
algebraic formal power series in terms of weighted logics.

An extended abstract of this paper appeared as [29]. This paper differs from it in the
following ways. First, full proofs are included. Second, the first main result, the logical
characterization of regular nested word series, has been extended and it is shown that an
existential fragment of weighted logics suffices to characterize weighted automata over nested
words. Third, rather than translating nested words to sp-biposets, the graph representation
of the free bisemigroup used by Ésik and Németh [17], we translate them to alternating texts, a
different representation. This has the advantage that we can more easily obtain a second
characterization of algebraic formal power series in terms of weighted logics. This second
characterization, which we include here in full length, was only sketched in the concluding
remarks of [29] and gives the fourth main difference.



2. Weighted Automata on Nested Words 

In this section we recall the notion of nested words which was introduced by Alur and
Madhusudan [4] and we define weighted automata for them. Let $A$ be a finite alphabet
and let $A^+$ be the free semigroup of finite but non-empty words. Let $w = a_1 \ldots a_n \in A^+$.
The length of $w$ is $|w| = n$. A nesting relation $\nu$ of width $n$ ($n \in \mathbb{N}$) is a binary relation on
$[n] = \{1, \ldots, n\}$ such that for all $1 \le i, j \le n$:

(1) if $\nu(i,j)$, then $i < j$,

(2) if $\nu(i,j)$ and $\nu(i,j')$, then $j = j'$, and if $\nu(i,j)$ and $\nu(i',j)$, then $i = i'$,

(3) if $\nu(i,j)$ and $\nu(i',j')$ and $i < i'$, then either $j < i'$ or $j' < j$.

If $\nu(i,j)$, we say $i$ is a call position and $j$ is a return position. Any $1 \le i \le n$ which is
neither a call nor a return position is called an internal position. We collect all nesting
relations of width $n$ in $\mathrm{Nest}_n$.

Definition 2.1 (Alur & Madhusudan [3]). A nested word (over $A$) is a pair $(w, \nu)$ such
that $w \in A^+$ and $\nu$ is a nesting relation of width $|w|$.
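
The three conditions on a nesting relation can be checked mechanically. The following is a minimal Python sketch, not part of the original development: the function names `is_nesting_relation` and `classify_positions` are hypothetical, and a nesting relation is assumed to be given as a set of pairs over the 1-based positions $1, \ldots, n$.

```python
def is_nesting_relation(nu, n):
    """Check conditions (1)-(3) for a set `nu` of pairs over positions 1..n."""
    pairs = set(nu)
    for (i, j) in pairs:
        if not (1 <= i <= n and 1 <= j <= n):
            return False            # pairs must lie inside [n]
        if not i < j:
            return False            # condition (1)
    for (i, j) in pairs:
        for (i2, j2) in pairs:
            if i == i2 and j != j2:
                return False        # condition (2): a call has one return
            if j == j2 and i != i2:
                return False        # condition (2): a return has one call
            if i < i2 and not (j < i2 or j2 < j):
                return False        # condition (3): arches do not cross
    return True


def classify_positions(nu, n):
    """Label every position 1..n as 'call', 'return' or 'internal'."""
    calls = {i for (i, _) in nu}
    returns = {j for (_, j) in nu}
    return ["call" if p in calls else "return" if p in returns else "internal"
            for p in range(1, n + 1)]


# The nested word (aacacabb, {(1, 2), (3, 8), (5, 7)}) used in Figure 1.
figure_1 = {(1, 2), (3, 8), (5, 7)}
print(is_nesting_relation(figure_1, 8))
print(classify_positions(figure_1, 8))
```

Condition (3) also rules out a position that is both a call and a return, which is why the classification above never has to choose between the two labels for a valid nesting relation.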




Figure 1: A visualization of the nested word (aacacabb, {(1, 2), (3, 8), (5, 7)}) 

We collect all nested words over $A$ in $\mathrm{NW}(A)$. Let $nw = (w, \nu) \in \mathrm{NW}(A)$ where
$w = a_1 \ldots a_n$. The factor $nw[i,j]$ for $1 \le i \le j \le n$ is the restriction of $nw$ to the positions
from $i$ to $j$; more formally $nw[i,j] = (a_i \ldots a_j, \nu[i,j])$ where $\nu[i,j] = \{(k,\ell) \mid 1 \le k, \ell \le
j - i + 1,\ (k + i - 1, \ell + i - 1) \in \nu\}$. Furthermore, we say a pair $(k,\ell) \in \nu$ is a surface arch
of $nw$ if there does not exist $(k',\ell') \in \nu$ with $k' < k < \ell < \ell'$.
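
As an illustration of factors and surface arches, here is a small Python sketch. The names `factor` and `surface_arches` are hypothetical, and a nested word is assumed to be given as a string together with a set of 1-based position pairs.

```python
def factor(word, nu, i, j):
    """Return nw[i, j] for 1 <= i <= j <= len(word), using 1-based positions."""
    sub_word = word[i - 1:j]
    # Keep the arches that lie entirely inside [i, j] and shift them to start at 1.
    sub_nu = {(k - i + 1, l - i + 1) for (k, l) in nu if i <= k and l <= j}
    return sub_word, sub_nu


def surface_arches(nu):
    """Return the arches of nu that are not strictly enclosed by another arch."""
    return {(k, l) for (k, l) in nu
            if not any(k2 < k and l < l2 for (k2, l2) in nu)}


# The nested word (aacacabb, {(1, 2), (3, 8), (5, 7)}) used in Figure 1.
word, nu = "aacacabb", {(1, 2), (3, 8), (5, 7)}
print(factor(word, nu, 3, 8))
print(surface_arches(nu))
```

Arches with only one endpoint inside the interval are dropped by `factor`, so the corresponding positions become internal positions of the factor.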

Nested words have been introduced in order to model executions of recursive programs 
as well as nested data structures such as XML documents. Here, we model quantitative 
behavior of systems or documents such as the runtime or the probability of an execution of 
a randomized program, or the number of occurrences of a certain type of entry in an XML 
document. We do this by assigning to a nested word a quantity expressing, for example, 
the runtime or the probability or the number of entries. 

Example 2.2. 

(1) As Alur and Madhusudan point out, XML documents or bibtex databases can naturally 
be modeled as nested words, where the nesting relation captures open and close tags [I]. 
Suppose we model bibtex databases as nested words. Then we may assign to a nested 
word e.g. the number of technical reports it stores. 

(2) Probabilistic automata have been used to model systems with uncertainty, such as
communication systems over lossy channels, to model fault-tolerant systems or to
model randomized programs. Consider the randomized recursive pseudo-procedure bar
where flip(Y) means flipping a fair coin Y:

    proc bar(){
      read(x);
      flip(Y); if (Y==head)
        beep;
      else
        bar();
      flip(Y); while (Y==head)
        write(x);
        flip(Y);
      exit;}

Consider furthermore the alphabet $A = \{r, w, b, call, ret\}$ of atomic events which stand for
read, write, beep, call and return. Now, an execution of bar could be as follows: read(x),
flip a coin and see tail, call recursively bar, read(x), flip a coin and see head, beep,
flip a coin and see tail, return from the recursive call, flip a coin and see head, write(x),
flip a coin and see head, write(x), flip a coin and see tail, exit the program. Then the
nested word $nw = (w, \nu)$ defined by $w = r.call.r.b.ret.w.w.ret$ and $\nu = \{(2,5)\}$ models this
execution of bar where $\nu$ encodes the recursive call of bar. We calculate the probability of
the execution by multiplying the probability of each atomic action (probability 1/2 for those
actions that depend on a coin flip), i.e. $1 \cdot 1/2 \cdot 1 \cdot 1/2 \cdot 1/2 \cdot 1/2 \cdot 1/2 \cdot 1/2 = 1/64$.
We will model bar using a weighted nested word automaton in Example 2.4 below.

To be as flexible as possible, we take the quantities we assign to a nested word from a
commutative semiring. A commutative semiring $\mathbb{K}$ is an algebraic structure $(\mathbb{K}, +, \cdot, 0, 1)$
such that $(\mathbb{K}, +, 0)$ and $(\mathbb{K}, \cdot, 1)$ are commutative monoids, multiplication distributes over
addition and $0$ is absorbing, i.e. $0 \cdot k = k \cdot 0 = 0$ for all $k \in \mathbb{K}$. For example, the natural
numbers $(\mathbb{N}, +, \cdot, 0, 1)$ form a commutative semiring. Other important examples are
the tropical semiring $(\mathbb{Z} \cup \{\infty\}, \min, +, \infty, 0)$ and the arctic or max-plus semiring
$(\mathbb{Z} \cup \{-\infty\}, \max, +, -\infty, 0)$ which have been used to model real-time systems or discrete
event systems. The tropical and arctic semirings possess the property that any finitely generated
submonoid of $(\mathbb{K}, +, 0)$ is finite. Such semirings are called additively locally finite. Another important
example of an additively locally finite semiring is the probabilistic semiring $([0,1], \max, \cdot, 0, 1)$.
We call a semiring locally finite if any finitely generated subsemiring is finite. Examples
include any Boolean algebra such as the trivial Boolean algebra $\mathbb{B} = (\{0,1\}, \vee, \wedge, 0, 1)$ as
well as $(\mathbb{R}_+ \cup \{\infty\}, \max, \min, 0, \infty)$ and the fuzzy semiring $([0,1], \max, \min, 0, 1)$.
In the following let $\mathbb{K}$ be a commutative semiring such that $0 \ne 1$.
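
The semirings above can be written down concretely. The sketch below is only an illustration under stated assumptions: the `Semiring` record and the helpers `check_laws` and `additive_closure` are hypothetical names, `check_laws` only spot-checks some axioms on finitely many sample elements, and `additive_closure` gives up once a size limit is exceeded (so a `None` result suggests, but does not prove, that the generated submonoid is infinite).

```python
import math
import operator
from dataclasses import dataclass
from fractions import Fraction
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any
    one: Any


NATURALS = Semiring(operator.add, operator.mul, 0, 1)
TROPICAL = Semiring(min, operator.add, math.inf, 0)
ARCTIC = Semiring(max, operator.add, -math.inf, 0)
PROBABILISTIC = Semiring(max, operator.mul, Fraction(0), Fraction(1))


def check_laws(sr, samples):
    """Spot-check neutrality, absorption, commutativity and distributivity."""
    for a in samples:
        if sr.add(a, sr.zero) != a or sr.mul(a, sr.one) != a:
            return False
        if sr.mul(a, sr.zero) != sr.zero:
            return False
        for b in samples:
            if sr.add(a, b) != sr.add(b, a) or sr.mul(a, b) != sr.mul(b, a):
                return False
            for c in samples:
                if sr.mul(a, sr.add(b, c)) != sr.add(sr.mul(a, b), sr.mul(a, c)):
                    return False
    return True


def additive_closure(sr, generators, limit=1000):
    """Submonoid of (K, +, 0) generated by `generators`, or None past `limit` elements."""
    closure = {sr.zero} | set(generators)
    while True:
        new = {sr.add(a, b) for a in closure for b in closure} - closure
        if not new:
            return closure
        closure |= new
        if len(closure) > limit:
            return None


print(additive_closure(TROPICAL, {3, 5}))   # stays finite: closed under min
print(additive_closure(NATURALS, {1}))      # exceeds the limit
```

In the tropical semiring the additive closure of a finite set only adds the neutral element, which is the reason it is additively locally finite, whereas in the natural numbers a single generator already produces unboundedly many sums.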

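Definition 2.3. A weighted nested word automaton (WNWA for short) is a quadruple
$\mathcal{A} = (Q, \iota, \delta, \kappa)$ where $\delta = (\delta_{call}, \delta_{int}, \delta_{ret})$ such that

(1) $Q$ is a finite set of states,

(2) $\delta_{call}, \delta_{int} : Q \times A \times Q \to \mathbb{K}$ are the call and internal transition functions,

(3) $\delta_{ret} : Q \times Q \times A \times Q \to \mathbb{K}$ is the return transition function,

(4) $\iota, \kappa : Q \to \mathbb{K}$ are the initial and final distribution.

A run of $\mathcal{A}$ on $nw = (a_1 \ldots a_n, \nu)$ is a sequence of states $r = (q_0, \ldots, q_n)$; we also write
$r : q_0 \xrightarrow{nw} q_n$. The weight of $r$ at position $1 \le j \le n$ is given by

\[
\mathrm{wgt}_{\mathcal{A}}(r, j) =
\begin{cases}
\delta_{call}(q_{j-1}, a_j, q_j) & \text{if } \nu(j,i) \text{ for some } j < i \le n, \\
\delta_{int}(q_{j-1}, a_j, q_j) & \text{if } j \text{ is an internal position}, \\
\delta_{ret}(q_{j-1}, q_{i-1}, a_j, q_j) & \text{if } \nu(i,j) \text{ for some } 1 \le i < j.
\end{cases}
\]

Now, the weight of $r$ is $\mathrm{wgt}_{\mathcal{A}}(r) = \prod_{1 \le j \le n} \mathrm{wgt}_{\mathcal{A}}(r, j)$ and the behavior $\|\mathcal{A}\| : \mathrm{NW}(A) \to \mathbb{K}$
of $\mathcal{A}$ is defined by

\[
(\|\mathcal{A}\|, nw) = \sum_{q_0, q_n \in Q} \iota(q_0) \cdot \sum_{r : q_0 \xrightarrow{nw} q_n} \mathrm{wgt}_{\mathcal{A}}(r) \cdot \kappa(q_n).
\]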

A function $S : \mathrm{NW}(A) \to \mathbb{K}$ is called a nested word series. As for formal power series
we write $(S, nw)$ for $S(nw)$. We define the scalar multiplication $\cdot$ and the sum $+$ pointwise,
i.e. for $k \in \mathbb{K}$ and any two nested word series $S_1, S_2$ we let $(k \cdot S_1, nw) = k \cdot (S_1, nw)$ and
$(S_1 + S_2, nw) = (S_1, nw) + (S_2, nw)$ for all $nw \in \mathrm{NW}(A)$. For $L \subseteq \mathrm{NW}(A)$ let $\mathbb{1}_L$ be the
characteristic series of $L$, i.e. the series that assumes $1$ for all $nw \in L$ and $0$ otherwise. A
nested word series $S$ is regular if there is a WNWA $\mathcal{A}$ such that $\|\mathcal{A}\| = S$. For $\mathbb{K} = \mathbb{B}$, i.e.
when $\delta_{call}$, $\delta_{int}$ and $\delta_{ret}$ are subsets of $Q \times A \times Q$ and $Q \times Q \times A \times Q$, or in other words
when the transitions do not carry a weight, Definition 2.3 is equivalent to the definition of an
(unweighted) nested word automaton [4]. A language of nested words $L \subseteq \mathrm{NW}(A)$ is then
called regular if it is accepted by a nested word automaton. It is easy to see that this is the
case iff the characteristic series $\mathbb{1}_L : \mathrm{NW}(A) \to \mathbb{B}$ is regular.

Example 2.4. The procedure bar of Example 2.2 can be modeled by a WNWA over
$\mathbb{K} = ([0,1], \max, \cdot, 0, 1)$ with four states $\{q_1, \ldots, q_4\}$. The transitions (only those with non-zero
weight) are given as follows. We let $\iota(q_1) = 1$ and $\kappa(q_4) = 1$. Moreover,

$\delta_{int}(q_1, r, q_2) = 1$, $\delta_{int}(q_2, b, q_3) = \delta_{int}(q_3, w, q_3) = \delta_{int}(q_3, ret, q_4) = 1/2$,

$\delta_{call}(q_2, call, q_1) = 1/2$, $\delta_{ret}(q_3, q_2, ret, q_3) = 1/2$.

Intuitively, each of the states corresponds to a line in the procedure bar which is the next
to be executed: $q_1$ corresponds to line 2, $q_2$ corresponds to line 3, $q_3$ corresponds to line
7 and $q_4$ is only reached at the end of an execution. Consider the nested word $nw$ of
Example 2.2(2). There is exactly one run $r : q_1 \xrightarrow{nw} q_4$ with $\mathrm{wgt}(r) \ne 0$. We start in state $q_1$,
execute $r$ and change to $q_2$. We then call and change back to $q_1$. After that we execute $r$
again and change to state $q_2$. We then execute $b$ and change to $q_3$. We return and stay in
$q_3$. Now we execute $w$ twice while staying in $q_3$ and finally end at state $q_4$. Observe that
the automaton assigns 1/64 to the nested word $nw$.
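
The value 1/64 can be recomputed by enumerating runs. The following Python sketch is a brute-force illustration rather than an efficient algorithm; the function `behavior` and the dictionary encoding of the transition functions are hypothetical. It assumes that the return transition takes the state before the return position first and the state before the matching call second, which is the reading that fits $\delta_{ret}(q_3, q_2, ret, q_3)$ in this example, and it takes sums with `max` as in the semiring $([0,1], \max, \cdot, 0, 1)$.

```python
from fractions import Fraction


def behavior(states, iota, kappa, d_call, d_int, d_ret, word, nu, add, mul, zero):
    """Sum over all runs of iota(q_0) * wgt(r) * kappa(q_n), by depth-first search."""
    n = len(word)
    calls = {i for (i, _) in nu}
    call_of = {j: i for (i, j) in nu}          # return position -> matching call

    def step_weight(run, j, q):
        # run = [q_0, ..., q_{j-1}]; weight of reading position j and moving to q
        a, prev = word[j - 1], run[j - 1]
        if j in calls:
            return d_call.get((prev, a, q), zero)
        if j in call_of:
            return d_ret.get((prev, run[call_of[j] - 1], a, q), zero)
        return d_int.get((prev, a, q), zero)

    total = zero

    def extend(run, weight):
        nonlocal total
        j = len(run)                           # next position to read
        if j > n:
            total = add(total, mul(weight, kappa.get(run[-1], zero)))
            return
        for q in states:
            w = step_weight(run, j, q)
            if w != zero:                      # zero is absorbing, so prune
                extend(run + [q], mul(weight, w))

    for q0 in states:
        if iota.get(q0, zero) != zero:
            extend([q0], iota[q0])
    return total


half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)
states = ["q1", "q2", "q3", "q4"]
iota, kappa = {"q1": one}, {"q4": one}
d_int = {("q1", "r", "q2"): one, ("q2", "b", "q3"): half,
         ("q3", "w", "q3"): half, ("q3", "ret", "q4"): half}
d_call = {("q2", "call", "q1"): half}
d_ret = {("q3", "q2", "ret", "q3"): half}

word = ["r", "call", "r", "b", "ret", "w", "w", "ret"]
nu = {(2, 5)}
weight = behavior(states, iota, kappa, d_call, d_int, d_ret, word, nu,
                  max, lambda x, y: x * y, zero)
print(weight)
```

Since position 8 is not part of the nesting relation, the final `ret` is read by the internal transition $\delta_{int}(q_3, ret, q_4)$.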






C. MATHISSEN 



3. Weighted Logics 

In this section we introduce another formalism for specifying nested word series. For
this we interpret a nested word $nw = (a_1 \ldots a_n, \nu)$ as a relational structure consisting of the
domain $\mathrm{dom}(nw) = [n]$ together with the unary relations $\mathrm{Lab}_a = \{i \in \mathrm{dom}(nw) \mid a_i = a\}$
for all $a \in A$, the binary relation $\nu$ and the usual $\le$ relation on $\mathrm{dom}(nw)$.

First, we recall classical monadic second-order logic. The set $\mathrm{MSO}(A, \le, \nu)$ (we also
write MSO for short) is given by the following grammar.

\[
\varphi ::= x = y \mid \mathrm{Lab}_a(x) \mid x \le y \mid \nu(x,y) \mid x \in X \mid \varphi \vee \psi \mid \neg\varphi \mid \exists x.\varphi \mid \exists X.\varphi
\]

where $a$ ranges over $A$, where $x, y$ are first-order variables and where $X$ is a second-order
variable. As usual we abbreviate $x < y = \neg(y \le x)$, $\varphi \to \psi = \neg\varphi \vee \psi$ and $\varphi \leftrightarrow \psi =
(\varphi \to \psi) \wedge (\psi \to \varphi)$ for any $\varphi, \psi \in \mathrm{MSO}$.

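Let $\varphi \in \mathrm{MSO}$ and let $\mathrm{Free}(\varphi)$ denote the set of variables that occur free in $\varphi$. Let $V$
be a finite set of first-order and second-order variables such that $\mathrm{Free}(\varphi) \subseteq V$. A $(V, nw)$-assignment
$\gamma$ is a mapping from $V$ to the powerset $\mathcal{P}(\mathrm{dom}(nw))$ such that first-order
variables are mapped to singletons. For $i \in \mathrm{dom}(nw)$ and $T \subseteq \mathrm{dom}(nw)$ we denote by
$\gamma[x \to i]$ (resp. $\gamma[X \to T]$) the $(V \cup \{x\}, nw)$-assignment (resp. $(V \cup \{X\}, nw)$-assignment)
which equals $\gamma$ on $V \setminus \{x\}$ (resp. $V \setminus \{X\}$) and assumes $\{i\}$ for $x$ (resp. $T$ for $X$). We write
$(nw, \gamma) \models \varphi$ if $\varphi$ holds in $nw$ under the assignment $\gamma$. We write $\varphi(x_1, \ldots, x_n, X_1, \ldots, X_m)$
if $\mathrm{Free}(\varphi) \subseteq \{x_1, \ldots, x_n, X_1, \ldots, X_m\}$. In this case we write $nw \models \varphi[i_1, \ldots, i_n, T_1, \ldots, T_m]$
whenever we have $(nw, \gamma) \models \varphi$ for $\gamma(x_j) = \{i_j\}$ and $\gamma(X_j) = T_j$. This is justified by
the fact that $(nw, \gamma) \models \varphi$ only depends on the restriction $\gamma|_{\mathrm{Free}(\varphi)}$ of $\gamma$ to $\mathrm{Free}(\varphi)$. Let
$\mathcal{L}_V(\varphi) = \{(nw, \gamma) \mid nw \in \mathrm{NW}(A),\ \gamma \text{ is a } (V, nw)\text{-assignment},\ (nw, \gamma) \models \varphi\}$. Abbreviate
$\mathcal{L}(\varphi) = \mathcal{L}_{\mathrm{Free}(\varphi)}(\varphi)$. Note that in case $\varphi$ is a sentence, i.e. $\mathrm{Free}(\varphi) = \emptyset$, we consider
$\mathcal{L}(\varphi)$ as a subset of $\mathrm{NW}(A)$.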

Let $Z \subseteq \mathrm{MSO}$. A language $L \subseteq \mathrm{NW}(A)$ is $Z$-definable if $L = \mathcal{L}(\varphi)$ for a sentence
$\varphi \in Z$. Formulae containing no quantification at all are called propositional. First-order
formulae, i.e. formulae containing only quantification over first-order variables, are collected
in FO. The class EMSO consists of all formulae $\varphi$ of the form $\exists X_1. \ldots \exists X_m.\psi$ where $\psi \in \mathrm{FO}$.
Alur and Madhusudan showed that monadic second-order logic and nested word automata
are equally expressive.

Theorem 3.1 (Alur & Madhusudan [3,4]). A nested word language $L \subseteq \mathrm{NW}(A)$ is regular
iff $L$ is MSO-definable iff $L$ is EMSO-definable.

We now turn to weighted monadic second-order logic as introduced by Droste and Gastin. The set
$\mathrm{MSO}(\mathbb{K}, A, \le, \nu)$ (once again we shortly write $\mathrm{MSO}(\mathbb{K})$) of weighted MSO formulae over $\mathbb{K}$
is given by the following grammar:

\begin{align*}
\varphi ::={} & k \mid x = y \mid \mathrm{Lab}_a(x) \mid x \le y \mid \nu(x,y) \mid x \in X \\
 & \mid \neg(x = y) \mid \neg\mathrm{Lab}_a(x) \mid \neg(x \le y) \mid \neg\nu(x,y) \mid \neg(x \in X) \\
 & \mid \varphi \vee \psi \mid \varphi \wedge \psi \mid \exists x.\varphi \mid \exists X.\varphi \mid \forall x.\varphi \mid \forall X.\varphi
\end{align*}

where $k \in \mathbb{K}$, where $a$ ranges over $A$, where $x, y$ are first-order variables and where $X$ is a
second-order variable. Note that we allow negation only for atomic formulae, i.e. for the
formulae $x = y$, $\mathrm{Lab}_a(x)$, $x \le y$, $\nu(x,y)$ and $x \in X$. This is because in general semirings
we do not have a natural complement and hence it is not clear how to define the semantics
of negation for values other than $0$ and $1$ (cf. [11]).
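
Let $\varphi \in \mathrm{MSO}(\mathbb{K})$ and $\mathrm{Free}(\varphi) \subseteq V$. The weighted semantics $[\![\varphi]\!]_V$ of $\varphi$ is a function
assigning a value in $\mathbb{K}$ to a nested word $nw$ and a $(V, nw)$-assignment $\gamma$. To each such pair
$(nw, \gamma)$ we assign an element of $\mathbb{K}$ inductively as follows. For $k \in \mathbb{K}$ we put $[\![k]\!]_V(nw, \gamma) = k$.
For every other atomic formula or negated atomic formula $\varphi$ the semantics $[\![\varphi]\!]_V$ is given by
the characteristic function $\mathbb{1}_{\mathcal{L}_V(\varphi)}$. Moreover, we define

\begin{align*}
[\![\varphi \vee \psi]\!]_V(nw, \gamma) &= [\![\varphi]\!]_V(nw, \gamma) + [\![\psi]\!]_V(nw, \gamma), \\
[\![\varphi \wedge \psi]\!]_V(nw, \gamma) &= [\![\varphi]\!]_V(nw, \gamma) \cdot [\![\psi]\!]_V(nw, \gamma), \\
[\![\exists x.\varphi]\!]_V(nw, \gamma) &= \sum_{i \in \mathrm{dom}(nw)} [\![\varphi]\!]_{V \cup \{x\}}(nw, \gamma[x \to i]), \\
[\![\exists X.\varphi]\!]_V(nw, \gamma) &= \sum_{T \subseteq \mathrm{dom}(nw)} [\![\varphi]\!]_{V \cup \{X\}}(nw, \gamma[X \to T]), \\
[\![\forall x.\varphi]\!]_V(nw, \gamma) &= \prod_{i \in \mathrm{dom}(nw)} [\![\varphi]\!]_{V \cup \{x\}}(nw, \gamma[x \to i]), \\
[\![\forall X.\varphi]\!]_V(nw, \gamma) &= \prod_{T \subseteq \mathrm{dom}(nw)} [\![\varphi]\!]_{V \cup \{X\}}(nw, \gamma[X \to T]).
\end{align*}

We put $[\![\varphi]\!] = [\![\varphi]\!]_{\mathrm{Free}(\varphi)}$. Observe that in the case where $\varphi$ is a sentence, $[\![\varphi]\!]$ can be
considered as a series from $\mathrm{NW}(A)$ to $\mathbb{K}$.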

Remark 3.2. A formula $\varphi \in \mathrm{MSO}(\mathbb{K})$ which does not contain a subformula $k \in \mathbb{K}$ can be
interpreted as an unweighted formula. We will use this implicitly in the sequel. Moreover,
note that if $\mathbb{K}$ is the Boolean semiring $\mathbb{B}$, then weighted logics and classical MSO logic
coincide. In this case $k$ is either $0$ (false) or $1$ (true).

Example 3.3.

(1) As in Example 2.2 suppose we model bibtex databases as nested words. Moreover,
assume that $tecrep \in A$ marks the beginning of an entry containing a technical report.
Now, let $\mathbb{K} = \mathbb{N}$ be the semiring of the natural numbers. Then $([\![\exists x.\mathrm{Lab}_{tecrep}(x)]\!], nw)$
counts the number of technical reports of the bibtex database modeled by $nw$.

(2) Again let $\mathbb{K} = \mathbb{N}$. Consider the formula $\varphi = \forall y.\exists x.1$. Then $([\![\exists x.1]\!], (a_1 \ldots a_n, \nu)) = n$
and $([\![\forall y.\exists x.1]\!], (a_1 \ldots a_n, \nu)) = n^n$. It can be shown as for words that $[\![\varphi]\!]$ is not regular
as it grows too fast (cf. Example 3.4 in [11]).
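
To make the inductive definition of the weighted semantics concrete, the following Python sketch evaluates the first-order part of the logic. It is an illustration only: the tuple encoding of formulae, the function name `sem` and the restriction to first-order quantification are assumptions made for brevity (set quantifiers would sum or multiply over all subsets of the domain in the same way).

```python
def sem(phi, word, nu, gamma, semiring):
    """Weighted semantics of a first-order formula under the assignment gamma."""
    add, mul, zero, one = semiring
    kind = phi[0]
    if kind == "const":                    # ("const", k)
        return phi[1]
    if kind == "lab":                      # ("lab", a, x)
        return one if word[gamma[phi[2]] - 1] == phi[1] else zero
    if kind == "le":                       # ("le", x, y)
        return one if gamma[phi[1]] <= gamma[phi[2]] else zero
    if kind == "nu":                       # ("nu", x, y)
        return one if (gamma[phi[1]], gamma[phi[2]]) in nu else zero
    if kind == "not":                      # ("not", atom), atomic formulae only
        return zero if sem(phi[1], word, nu, gamma, semiring) == one else one
    if kind == "or":
        return add(sem(phi[1], word, nu, gamma, semiring),
                   sem(phi[2], word, nu, gamma, semiring))
    if kind == "and":
        return mul(sem(phi[1], word, nu, gamma, semiring),
                   sem(phi[2], word, nu, gamma, semiring))
    if kind in ("exists", "forall"):       # ("exists", x, body)
        _, x, body = phi
        combine, result = (add, zero) if kind == "exists" else (mul, one)
        for i in range(1, len(word) + 1):
            result = combine(result, sem(body, word, nu, {**gamma, x: i}, semiring))
        return result
    raise ValueError(f"unknown formula kind: {kind}")


NAT = (lambda a, b: a + b, lambda a, b: a * b, 0, 1)

count_all = ("exists", "x", ("const", 1))
grows_fast = ("forall", "y", ("exists", "x", ("const", 1)))
count_reports = ("exists", "x", ("lab", "tecrep", "x"))

print(sem(count_all, "abc", set(), {}, NAT))
print(sem(grows_fast, "abc", set(), {}, NAT))
print(sem(count_reports, ["tecrep", "article", "tecrep"], set(), {}, NAT))
```

On a word of length 3 the first two formulae evaluate to $3$ and $3^3$, matching the values $n$ and $n^n$ of part (2).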

Let $Z \subseteq \mathrm{MSO}(\mathbb{K})$. A series $S : \mathrm{NW}(A) \to \mathbb{K}$ is $Z$-definable if $S = [\![\varphi]\!]$ for a sentence
$\varphi \in Z$. Example 3.3(2) shows that unrestricted application of universal quantification does
not preserve regularity. Therefore we now define different fragments of $\mathrm{MSO}(\mathbb{K})$.

Note that the fragment $\mathrm{RMSO}(\mathbb{K})$, the collection of restricted formulae, which was
considered in [11] and which characterizes regular formal power series, is a semantic restriction,
and it is not clear whether membership in $\mathrm{RMSO}(\mathbb{K})$ can be decided. In order to have a
decidable fragment, we now syntactically define the fragment $\mathrm{sRMSO}(\mathbb{K})$. For this we follow
the approach of [12].

The idea is to restrict universal first-order quantification to formulae having a semantics
that takes on only finitely many values. To this aim we start by identifying a class of
formulae $\varphi$ that take on the values $0$ and $1$ only; more precisely, we will have $\mathbb{1}_{\mathcal{L}_V(\varphi)} = [\![\varphi]\!]_V$.
The problem that arises is that by definition of the semantics, $\vee$ gets translated by means
of $+$. Hence, for a formula $\varphi = \varphi_1 \vee \varphi_2$ we only want to evaluate $\varphi_2$ if $\varphi_1$ evaluates to $0$,
otherwise we might end up with a sum greater than one. A similar problem occurs for $\exists x.$
and $\exists X.$

Given a classical (unweighted) MSO-formula $\varphi$ we assign to it formulae $\varphi^+$ and $\varphi^-$ such
that $[\![\varphi^+]\!] = \mathbb{1}_{\mathcal{L}(\varphi)}$ and $[\![\varphi^-]\!] = \mathbb{1}_{\mathcal{L}(\neg\varphi)}$. The crucial point is that we have a linear order at
our disposal.

(1) If $\varphi$ is of the form $x = y$, $\mathrm{Lab}_a(x)$, $x \le y$, $\nu(x,y)$ or $x \in X$, then $\varphi^+ = \varphi$ and $\varphi^- = \neg\varphi$.

(2) If $\varphi = \neg\psi$, then $\varphi^+ = \psi^-$ and $\varphi^- = \psi^+$.

(3) If $\varphi = \psi \vee \psi'$, then $\varphi^+ = \psi^+ \vee (\psi^- \wedge \psi'^+)$ and $\varphi^- = \psi^- \wedge \psi'^-$.

(4) If $\varphi = \exists x.\psi(x)$, then $\varphi^+ = \exists x.\psi(x)^+ \wedge \forall y.(y < x \wedge \psi(y))^-$ and $\varphi^- = \forall x.\psi(x)^-$.

In order to disambiguate set quantification, we have to define a linear order on the subsets
of the domain of a nested word or equivalently on nested words (of fixed length) over the
alphabet $\{0, 1\}$. We take the lexicographic order $<$ which is given by the following formula.

\[
X < Y = \exists y.\, y \in Y \wedge \neg(y \in X) \wedge \forall z.[z < y \to (z \in X \leftrightarrow z \in Y)]^+
\]

Now we proceed:

(5) If $\varphi = \exists X.\psi(X)$, then $\varphi^+ = \exists X.\psi(X)^+ \wedge \forall Y.(Y < X \wedge \psi(Y))^-$ and $\varphi^- = \forall X.\psi(X)^-$.

Formulae of the form $\varphi^+$ or $\varphi^-$ for some $\varphi \in \mathrm{MSO}$ are called syntactically unambiguous.
Observe that if $\varphi$ is syntactically unambiguous, then $[\![\varphi]\!]_V = \mathbb{1}_{\mathcal{L}_V(\varphi)}$ for any finite set of variables
$V \supseteq \mathrm{Free}(\varphi)$. In the following, we shortly write $\varphi \xrightarrow{+} \psi$ for $\varphi^- \vee (\varphi^+ \wedge \psi)$ for any two weighted
formulae $\varphi, \psi$ where $\varphi$ does not contain subformulae of the form $k$ ($k \in \mathbb{K}$) and hence is also
a classical formula.
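
The effect of the disambiguation can be seen on a small instance. The following worked computation is a sketch over $\mathbb{K} = \mathbb{N}$ for the nested word $nw = (aa, \emptyset)$; it assumes that a conjunction $\psi \wedge \psi'$ of classical formulae is read as the abbreviation $\neg(\neg\psi \vee \neg\psi')$, so that rules (2) and (3) yield $(\psi \wedge \psi')^- = \psi^- \vee (\psi^+ \wedge \psi'^-)$.

```latex
% Plain existential quantification counts all witnesses:
\[
  ([\![\exists x.\mathrm{Lab}_a(x)]\!], nw) = 1 + 1 = 2 .
\]
% With y < x = \neg(x \le y), rules (1)-(3) give
\[
  (y < x \wedge \mathrm{Lab}_a(y))^-
    = (x \le y) \vee \bigl(\neg(x \le y) \wedge \neg\mathrm{Lab}_a(y)\bigr),
\]
% so only the least witness x = 1 survives in
% (\exists x.\mathrm{Lab}_a(x))^+ = \exists x.\mathrm{Lab}_a(x) \wedge \forall y.(y < x \wedge \mathrm{Lab}_a(y))^- :
\begin{align*}
  x = 1 &: \quad \underbrace{1}_{\mathrm{Lab}_a(1)} \cdot
           \underbrace{(1 + 0)}_{y = 1} \cdot \underbrace{(1 + 0)}_{y = 2} = 1, \\
  x = 2 &: \quad \underbrace{1}_{\mathrm{Lab}_a(2)} \cdot
           \underbrace{(0 + 1 \cdot 0)}_{y = 1} \cdot \underbrace{(1 + 0)}_{y = 2} = 0,
\end{align*}
\[
  ([\![(\exists x.\mathrm{Lab}_a(x))^+]\!], nw) = 1 + 0 = 1 .
\]
```

The sum over $x$ thus contains exactly one non-zero summand, which is why the semantics of a syntactically unambiguous formula stays within $\{0, 1\}$.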

We define $\mathrm{aUMSO}(\mathbb{K})$, the collection of almost unambiguous formulae, to be the smallest
subset of $\mathrm{MSO}(\mathbb{K})$ containing all constants $k$ ($k \in \mathbb{K}$) and all syntactically unambiguous
formulae which is closed under conjunction and disjunction. Using the distributivity, observe
that for any $\varphi \in \mathrm{aUMSO}(\mathbb{K})$ there is a formula $\varphi'$ of the form $\varphi' = \bigvee_{i=1}^{n} (k_i \wedge \varphi_i)$ for some
$k_i \in \mathbb{K}$ and syntactically unambiguous $\varphi_i$ such that $[\![\varphi]\!] = [\![\varphi']\!]$ (cf. [E]). We are now ready
to define the fragment $\mathrm{sRMSO}(\mathbb{K})$.

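Definition 3.4. A weighted formula $\varphi$ is in $\mathrm{sRMSO}(\mathbb{K})$ (syntactically restricted MSO) if
for every subformula $\vartheta$ of $\varphi$ the following two conditions hold:

(1) If $\vartheta = \forall X.\psi$ for some $\psi \in \mathrm{MSO}(\mathbb{K})$, then $\psi$ is syntactically unambiguous.

(2) If $\vartheta = \forall x.\psi$ for some $\psi \in \mathrm{MSO}(\mathbb{K})$, then $\psi \in \mathrm{aUMSO}(\mathbb{K})$.

We collect in $\mathrm{sRFO}(\mathbb{K})$ all $\varphi \in \mathrm{sRMSO}(\mathbb{K})$ which do not contain any set quantification and
we collect in $\mathrm{sREMSO}(\mathbb{K})$ all $\varphi \in \mathrm{sRMSO}(\mathbb{K})$ of the form $\exists X_1. \ldots \exists X_m.\psi$ with $\psi \in \mathrm{sRFO}(\mathbb{K})$.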

Let now $\mathrm{wUMSO}(\mathbb{K})$, the collection of weakly unambiguous formulae, be the smallest
subset of $\mathrm{MSO}(\mathbb{K})$ containing all constants $k$ ($k \in \mathbb{K}$) and all syntactically unambiguous
formulae which is closed under conjunction, disjunction and existential quantification (both
first- and second-order). We define the fragment $\mathrm{swRMSO}(\mathbb{K})$.

Definition 3.5. A weighted formula $\varphi$ is in $\mathrm{swRMSO}(\mathbb{K})$ (syntactically weakly restricted
MSO) if for every subformula $\vartheta$ of $\varphi$ the following two conditions hold:

(1) If $\vartheta = \forall X.\psi$ for some $\psi \in \mathrm{MSO}(\mathbb{K})$, then $\psi$ is syntactically unambiguous.

(2) If $\vartheta = \forall x.\psi$ for some $\psi \in \mathrm{MSO}(\mathbb{K})$, then $\psi \in \mathrm{wUMSO}(\mathbb{K})$.

Clearly, $\mathrm{aUMSO}(\mathbb{K}) \subseteq \mathrm{wUMSO}(\mathbb{K}) \subseteq \mathrm{sRMSO}(\mathbb{K}) \subseteq \mathrm{swRMSO}(\mathbb{K}) \subseteq \mathrm{MSO}(\mathbb{K})$. The first
main result of this paper is the characterization of regular nested word series using weighted
logics. It reads as follows.

Theorem 3.6. Let $\mathbb{K}$ be a commutative semiring and let $S : \mathrm{NW}(A) \to \mathbb{K}$ be a nested word
series. Then the following holds.

(a) $S$ is regular iff it is $\mathrm{sRMSO}(\mathbb{K})$-definable iff it is $\mathrm{sREMSO}(\mathbb{K})$-definable.

(b) If $\mathbb{K}$ is additively locally finite, then $S$ is regular iff it is $\mathrm{swRMSO}(\mathbb{K})$-definable.

(c) If $\mathbb{K}$ is locally finite, then $S$ is regular iff it is $\mathrm{MSO}(\mathbb{K})$-definable.

We prove the result at the end of Section 5 by interpreting nested words in alternating
texts. In the next section we introduce alternating texts and weighted automata for them.

Example 3.7. The nesting depth of a position $i$ of a nested word $nw$ is the number of open
call positions (i.e. call positions whose corresponding return position has not occurred yet,
including the position itself). The nesting depth of a nested word is the maximum nesting depth of
its positions. Let $\mathbb{K} = (\mathbb{Z} \cup \{-\infty\}, \max, +, -\infty, 0)$ and let

\[
\mathrm{open}(x) = \forall y.\bigl[\bigl((y \le x \wedge \mathrm{call}(y)) \xrightarrow{+} 1\bigr) \wedge \bigl((y < x \wedge \mathrm{return}(y)) \xrightarrow{+} -1\bigr)\bigr]
\]

where $\mathrm{call}(x) = \exists y.\nu(x,y)$ and $\mathrm{return}(x) = \exists y.\nu(y,x)$.

Then $[\![\exists x.\mathrm{open}(x)]\!]$ assigns to a nested word its nesting depth. Hence, since $\exists x.\mathrm{open}(x) \in
\mathrm{sRMSO}(\mathbb{K})$, the series is regular by Theorem 3.6.
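
The max-plus reading of this example can be traced in a few lines of Python. The sketch below is an illustration under assumptions: the function name `nesting_depth` is hypothetical, calls are counted at positions $y \le x$ and returns at positions $y < x$ (one plausible reading of the formula), and the semiring operations are unfolded by hand, so that $\forall y$ and $\wedge$ become ordinary addition, $\exists x$ becomes a maximum, and an implication that does not fire contributes the semiring one, which is the integer 0.

```python
def nesting_depth(n, nu):
    """Evaluate the sentence 'exists x. open(x)' in (Z u {-inf}, max, +, -inf, 0)."""
    calls = {i for (i, _) in nu}
    returns = {j for (_, j) in nu}

    def open_at(x):
        total = 0                          # semiring one of max-plus
        for y in range(1, n + 1):          # 'forall y' is a semiring product, i.e. a sum
            total += 1 if (y <= x and y in calls) else 0
            total += -1 if (y < x and y in returns) else 0
        return total

    # 'exists x' is a semiring sum, i.e. a maximum; n >= 1 since words are non-empty
    return max(open_at(x) for x in range(1, n + 1))


print(nesting_depth(8, {(1, 2), (3, 8), (5, 7)}))
print(nesting_depth(6, {(1, 6), (2, 5), (3, 4)}))
```

For the nested word $(aacacabb, \{(1,2), (3,8), (5,7)\})$ the maximum is attained at position 5, where the arches $(3,8)$ and $(5,7)$ are both open.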

4. Alternating Texts 

A bisemigroup is a set together with two associative operations. Several authors
investigated the free bisemigroup as a fundamental, two-dimensional extension of classical
automaton theory, see e.g. Ésik and Németh [17] and Hashiguchi et al. (e.g. [19-21]). Ésik
and Németh considered as a representation for the free bisemigroup the so-called sp-biposets,
a certain class of biposets. A different representation of the free bisemigroup over some finite
set $A$ are the so-called alternating texts [16,22]. A text over $A$ is a tuple $(V, \lambda, \le_1, \le_2)$
where $\le_1$ and $\le_2$ are linear orders over a finite but non-empty domain $V$ and $\lambda : V \to A$ is
a labeling function. Of course we consider texts only up to isomorphism. Therefore, unless
otherwise specified, the domain of a text will be $[n] = \{1, \ldots, n\}$ for some $n \in \mathbb{N}$ and $\le_1$
will correspond to the canonical order on $[n]$.

We define the binary operations $\circ$ and $\bullet$, called the horizontal and vertical product,
on texts as follows: Let $\tau_1 = (V, \lambda, \le_1, \le_2)$ and $\tau_2 = (V', \lambda', \le'_1, \le'_2)$ be two texts where we
assume that $V$ and $V'$ are disjoint. Then

\begin{align*}
\tau_1 \circ \tau_2 &= (V \uplus V',\ \lambda \cup \lambda',\ \le_1 \cup \le'_1 \cup\, V \times V',\ \le_2 \cup \le'_2 \cup\, V \times V'), \\
\tau_1 \bullet \tau_2 &= (V \uplus V',\ \lambda \cup \lambda',\ \le_1 \cup \le'_1 \cup\, V \times V',\ \le_2 \cup \le'_2 \cup\, V' \times V).
\end{align*}




Figure 2: A visualization of the alternating text given by (a • a) o (c • a • (c o a o b) • b). 

Here we only give the successor relation of the second order. The first order is 
given simply from the left to the right. 
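
The two products can be sketched in Python as follows. The encoding is an assumption made for illustration: a text with domain $[n]$ and canonical first order is stored as a pair of its word (read along the first order) and the list of positions sorted by the second order; the names `singleton`, `horizontal` and `vertical` are hypothetical.

```python
def singleton(a):
    """The one-letter text labelled a."""
    return (a, [1])


def horizontal(t1, t2):
    """Horizontal product: t1 comes before t2 in both orders."""
    (w1, p1), (w2, p2) = t1, t2
    shift = len(w1)
    return (w1 + w2, p1 + [p + shift for p in p2])


def vertical(t1, t2):
    """Vertical product: t1 before t2 in the first order, t2 before t1 in the second."""
    (w1, p1), (w2, p2) = t1, t2
    shift = len(w1)
    return (w1 + w2, [p + shift for p in p2] + p1)


a, b, c = singleton("a"), singleton("b"), singleton("c")

# The text (a . a) o (c . a . (c o a o b) . b), where "." is the vertical product.
left = vertical(a, a)
inner = horizontal(horizontal(c, a), b)
right = vertical(vertical(vertical(c, a), inner), b)
text = horizontal(left, right)
print(text)
```

Both operations are associative in this encoding, so the bracketing chosen for the iterated vertical product does not matter.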

Let $\mathrm{TXT}(A)$ be the class of texts which can be obtained from the singleton texts
by finite applications of $\circ$ and $\bullet$. This class was named the class of alternating texts in [16].
The class $\mathrm{TXT}(A)$ together with the operations $\circ$, $\bullet$ is the free bisemigroup over $A$ [22].
Let monadic second-order logic $\mathrm{MSO}(A, \le_1, \le_2)$ and weighted logics for texts, denoted
$\mathrm{MSO}(\mathbb{K}, A, \le_1, \le_2)$, be defined along the same lines as for nested words. Moreover, define
$\mathrm{sRMSO}(\mathbb{K}, A, \le_1, \le_2)$ and $\mathrm{swRMSO}(\mathbb{K}, A, \le_1, \le_2)$ using the linear order $\le_1$.

Now we introduce weighted parenthesizing automata (cf. [28]) operating on the free
bisemigroup, generalizing parenthesizing automata as introduced by Ésik and Németh [17].

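Definition 4.1. A tuple $\mathcal{A} = (\mathcal{H}, \mathcal{V}, \Omega, \mu, \mu_{op}, \mu_{cl}, \lambda, \gamma)$ is a weighted parenthesizing
automaton (WPA) provided that

• $\mathcal{H}$ and $\mathcal{V}$ are finite, disjoint sets of horizontal and vertical states, respectively,

• $\Omega$ is a finite set of parentheses,

• $\mu : (\mathcal{H} \times A \times \mathcal{H}) \cup (\mathcal{V} \times A \times \mathcal{V}) \to \mathbb{K}$ is the transition function,

• $\mu_{op}, \mu_{cl} : (\mathcal{H} \times \Omega \times \mathcal{V}) \cup (\mathcal{V} \times \Omega \times \mathcal{H}) \to \mathbb{K}$ are the opening and closing parenthesizing
functions, respectively,

• $\lambda, \gamma : \mathcal{H} \cup \mathcal{V} \to \mathbb{K}$ are the initial and final weight functions, respectively.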

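We now come to the notion of a run $r$ of $\mathcal{A}$. We give an inductive definition where we also
define its label $\mathrm{lab}(r) \in \mathrm{TXT}(A)$, its weight $\mathrm{wgt}_{\mathcal{A}}(r) \in \mathbb{K}$, its initial state $\mathrm{init}(r) \in \mathcal{H} \cup \mathcal{V}$
and its final state $\mathrm{fin}(r) \in \mathcal{H} \cup \mathcal{V}$. Formally the set of runs of $\mathcal{A}$ is the smallest set of words
over the alphabet $A \cup \Omega \cup \mathcal{H} \cup \mathcal{V} \cup \{(, )\} \cup \{,\}$ such that:

(1) The word $(q_1, a, q_2)$ is a run for all $(q_1, q_2) \in (\mathcal{H} \times \mathcal{H}) \cup (\mathcal{V} \times \mathcal{V})$ and $a \in A$. We set

$\mathrm{lab}((q_1, a, q_2)) = a \in \mathrm{TXT}(A)$, $\mathrm{wgt}_{\mathcal{A}}((q_1, a, q_2)) = \mu(q_1, a, q_2)$,
$\mathrm{init}((q_1, a, q_2)) = q_1$ and $\mathrm{fin}((q_1, a, q_2)) = q_2$.

(2) If $r_1$ and $r_2$ are runs such that $\mathrm{fin}(r_1) = \mathrm{init}(r_2) \in \mathcal{H}$ (respectively such that $\mathrm{fin}(r_1) =
\mathrm{init}(r_2) \in \mathcal{V}$), then $r = r_1 r_2$ is a run having

$\mathrm{lab}(r) = \mathrm{lab}(r_1) \circ \mathrm{lab}(r_2)$ (resp. $\mathrm{lab}(r) = \mathrm{lab}(r_1) \bullet \mathrm{lab}(r_2)$),

$\mathrm{wgt}_{\mathcal{A}}(r) = \mathrm{wgt}_{\mathcal{A}}(r_1) \cdot \mathrm{wgt}_{\mathcal{A}}(r_2)$, $\mathrm{init}(r) = \mathrm{init}(r_1)$ and $\mathrm{fin}(r) = \mathrm{fin}(r_2)$.

(3) If a run $r$ resulting from (2) has $\mathrm{init}(r) \in \mathcal{H}$ (resp. $\mathrm{init}(r) \in \mathcal{V}$) and if $q_1, q_2 \in \mathcal{V}$ (resp. if
$q_1, q_2 \in \mathcal{H}$) and $s \in \Omega$, then $r' = (q_1, (_s, \mathrm{init}(r))\, r\, (\mathrm{fin}(r), )_s, q_2)$ is a run. We set

$\mathrm{lab}(r') = \mathrm{lab}(r)$, $\mathrm{init}(r') = q_1$ and $\mathrm{fin}(r') = q_2$,

$\mathrm{wgt}_{\mathcal{A}}(r') = \mu_{op}((q_1, (_s, \mathrm{init}(r))) \cdot \mathrm{wgt}_{\mathcal{A}}(r) \cdot \mu_{cl}((\mathrm{fin}(r), )_s, q_2))$.

Let $\tau \in \mathrm{TXT}(A)$. Since in (3) above we require that the run $r$ we start with results from
(2), we do not allow repeated application of (3) and therefore there are only finitely many
runs $r$ of $\mathcal{A}$ with label $\tau$. Intuitively, we do not allow for doubled parentheses. If $r$ is a run
of $\mathcal{A}$ with $\mathrm{lab}(r) = \tau$, $\mathrm{init}(r) = q_1$, $\mathrm{fin}(r) = q_2$, we write $r : q_1 \xrightarrow{\tau} q_2$. The behavior of $\mathcal{A}$ is a
text series $\|\mathcal{A}\| : \mathrm{TXT}(A) \to \mathbb{K}$. It is given by

\[
(\|\mathcal{A}\|, \tau) = \sum_{q_1, q_2 \in \mathcal{H} \cup \mathcal{V}} \lambda(q_1) \cdot \sum_{r : q_1 \xrightarrow{\tau} q_2} \mathrm{wgt}_{\mathcal{A}}(r) \cdot \gamma(q_2).
\]

A text series $S$ is regular if there is a WPA $\mathcal{A}$ such that $\|\mathcal{A}\| = S$.

1 We let $s \in \Omega$ represent both an opening and a closing parenthesis. To help the intuition we also
write $(_s$ or $)_s$ for $s$.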
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Theorem 4.2 (see [30]). Let K be a commutative semiring and let S : TXT(A) — > IK be an 
alternating text series. Then the following holds. 

(a) S is regular iff it is sRMSO(IK) -definable iff it is sREMSO(IK) -definable. 

(b) If K is additively locally finite, then S is regular iff it is swRMSO(IK)- definable. 

(c) If IK is locally finite, then S is regular iff it is MSO(IK) -definable. 

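We note that the proof in [30] is effective, i.e. given an sRMSO($\mathbb{K}$) (resp. swRMSO($\mathbb{K}$),
resp. MSO($\mathbb{K}$)) formula $\varphi$ we can effectively construct a WPA $\mathcal{A}$ such that $[\![\varphi]\!] = \|\mathcal{A}\|$,
and conversely, given a WPA $\mathcal{A}$ we can effectively construct $\varphi \in$ sREMSO($\mathbb{K}$) such that
$[\![\varphi]\!] = \|\mathcal{A}\|$.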



5. Interpreting Nested Words in Alternating Texts 

We will now derive similar results for nested words as for alternating texts by interpret- 
ing the different structures within each other. For this we utilize definable transductions 
as introduced by Courcelle [9]. We only have to ensure that they preserve definability, now 
with respect to weighted logics. First, we introduce the notion of definable transductions. 
For this let $\sigma_1$ and $\sigma_2 = ((R_i)_{i \in I}, \rho)$ be two relational signatures where $\rho : I \to \mathbb{N}_+$ assigns
to each relation symbol $R_i$ a positive arity. Moreover, let $\mathcal{C}_1$ and $\mathcal{C}_2$ be classes of finite $\sigma_1$-
and $\sigma_2$-structures, respectively. Let monadic second-order logic MSO($\sigma_1$) and MSO($\sigma_2$) be
defined along the same lines as for nested words.

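By a $(\sigma_1, \sigma_2)$-1-copying definition scheme with parameters $X_1, \ldots, X_n$ we mean a tuple
$\mathcal{D} = (\vartheta, \delta, (\varphi_i)_{i \in I})$ of formulae in MSO($\sigma_1$) such that $\mathrm{Free}(\vartheta) \subseteq \{X_1, \ldots, X_n\}$, $\mathrm{Free}(\delta) \subseteq
\{x_1, X_1, \ldots, X_n\}$ and $\mathrm{Free}(\varphi_i) \subseteq \{x_1, \ldots, x_{\rho(i)}, X_1, \ldots, X_n\}$ for all $i \in I$.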

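Let $\mathcal{D}$ be a $(\sigma_1, \sigma_2)$-1-copying definition scheme, let $s_1 \in \mathcal{C}_1$ and let $T_1, \ldots, T_n$ be subsets
of the domain $\mathrm{dom}(s_1)$ of $s_1$ such that $s_1 \models \vartheta[T_1, \ldots, T_n]$. Then define the $\sigma_2$-structure
$\mathrm{def}_{\mathcal{D}}(s_1, T_1, \ldots, T_n) = s_2$ with domain $\mathrm{dom}(s_2) \subseteq \mathrm{dom}(s_1)$ and interpretations of relation
symbols $R_i^{s_2}$ given as follows:

$v \in \mathrm{dom}(s_2) \iff s_1 \models \delta[v, T_1, \ldots, T_n]$ for all $v \in \mathrm{dom}(s_1)$,

$(v_1, \ldots, v_{\rho(i)}) \in R_i^{s_2} \iff s_1 \models \varphi_i[v_1, \ldots, v_{\rho(i)}, T_1, \ldots, T_n]$ for all $i \in I$ and
all $v_1, \ldots, v_{\rho(i)} \in \mathrm{dom}(s_2)$.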

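By abusing notation, we define the transduction $\mathrm{def}_{\mathcal{D}} \subseteq \mathcal{C}_1 \times \mathcal{C}_2$ by letting $(s_1, s_2) \in \mathrm{def}_{\mathcal{D}}$
iff $s_1 \in \mathcal{C}_1$ and there are sets $T_1, \ldots, T_n \subseteq \mathrm{dom}(s_1)$ with $s_1 \models \vartheta[T_1, \ldots, T_n]$ such that $s_2 =
\mathrm{def}_{\mathcal{D}}(s_1, T_1, \ldots, T_n)$. Let us call a definition scheme $\mathcal{D}$ with parameters $X_1, \ldots, X_n$ unambiguous if for
any pair $(s_1, s_2) \in \mathrm{def}_{\mathcal{D}}$ there is at most one assignment of parameters $\gamma : \{X_1, \ldots, X_n\} \to
\mathcal{P}(\mathrm{dom}(s_1))$ such that $\mathrm{def}_{\mathcal{D}}(s_1, \gamma(X_1), \ldots, \gamma(X_n)) = s_2$.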

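Definition 5.1. A transduction $\Phi \subseteq \mathcal{C}_1 \times \mathcal{C}_2$ is unambiguously definable if there is an
unambiguous definition scheme $\mathcal{D}$ such that $\Phi = \mathrm{def}_{\mathcal{D}}$. It is unambiguously FO-definable if there
is an unambiguous definition scheme $\mathcal{D} = (\vartheta, \delta, (\varphi_i)_{i \in I})$ defining $\Phi$ with $\vartheta, \delta, (\varphi_i)_{i \in I} \in$ FO.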

A transduction which is given by a less restricted definition scheme, where one allows 
for more than one copy of s\ and which is not necessarily unambiguous, is called definable. 
Courcelle [9] showed that the preimage of a definable set under a definable transduction 
is again definable. We will show a similar result for series. Let $\Phi : \mathcal{C}_1 \to \mathcal{C}_2$ be a partial
function with domain $\mathrm{dom}(\Phi)$ and let $S : \mathcal{C}_2 \to \mathbb{K}$. Define $\Phi^{-1}(S)$ by letting $(\Phi^{-1}(S), s_1) =
(S, \Phi(s_1))$ for all $s_1 \in \mathrm{dom}(\Phi)$ and $(\Phi^{-1}(S), s_1) = 0$ otherwise. If $\Phi$ is injective, we let
$\Phi(S) = (\Phi^{-1})^{-1}(S)$.

Clearly, MSO($\mathbb{K}$) can be defined for $\mathcal{C}_1$ and $\mathcal{C}_2$ along the same lines as for nested words.
In order to disambiguate a formula, we need a linear order on each $s \in \mathcal{C}_1$ (resp. $\mathcal{C}_2$). For
the next proposition we therefore assume that there are binary relation symbols $\le_1 \in \sigma_1$ and
$\le_2 \in \sigma_2$ such that the interpretation of $\le_i$ in $s$ is a linear order for any $s \in \mathcal{C}_i$ ($i = 1, 2$). Using
these linear orders we can define syntactically unambiguous formulae and then sRMSO($\mathbb{K}$)
and swRMSO($\mathbb{K}$) over $\sigma_1$ and $\sigma_2$.

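Proposition 5.2. Let $\Phi : \mathcal{C}_1 \to \mathcal{C}_2$ be an unambiguously definable partial function. Then
the following holds:

(1) If $S : \mathcal{C}_2 \to \mathbb{K}$ is MSO($\mathbb{K}$)-definable, then so is $\Phi^{-1}(S)$.

(2) If $S : \mathcal{C}_2 \to \mathbb{K}$ is sRMSO($\mathbb{K}$)-definable, then so is $\Phi^{-1}(S)$.

(3) If $S : \mathcal{C}_2 \to \mathbb{K}$ is swRMSO($\mathbb{K}$)-definable, then so is $\Phi^{-1}(S)$.

(4) If $\Phi$ is unambiguously FO-definable and $S : \mathcal{C}_2 \to \mathbb{K}$ is sREMSO($\mathbb{K}$)-definable, then
$\Phi^{-1}(S)$ is sREMSO($\mathbb{K}$)-definable.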



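Proof sketch. Full proof and more general results can be found in [30, 31].

Let $\mathcal{D} = (\vartheta, \delta, (\varphi_i)_{i \in I})$ be an unambiguous definition scheme defining $\Phi$. Let $\varphi \in$ MSO($\mathbb{K}$).
By induction on the structure of $\varphi$ we now define the formula $\overline{\varphi} \in$ MSO($\mathbb{K}, \sigma_1$).

\[ \overline{k} = k, \quad \overline{x = y} = (x = y), \quad \overline{x \in X} = x \in X, \quad \overline{R_i(x_1, \ldots, x_{\rho(i)})} = \varphi_i(x_1, \ldots, x_{\rho(i)}, X_1, \ldots, X_n)^{+} \]

If $\varphi$ is $x = y$, $x \in X$ or $R_i(x_1, \ldots, x_{\rho(i)})$ let $\overline{\neg\varphi} = (\overline{\varphi})^{-}$. Moreover, let

\[ \overline{\psi_1 \wedge \psi_2} = \overline{\psi_1} \wedge \overline{\psi_2} \]

\[ \overline{\psi_1 \vee \psi_2} = \begin{cases} (\overline{\psi_1} \vee \overline{\psi_2})^{+} & \text{if } \psi_1 \vee \psi_2 \text{ is syntactically unambiguous} \\ \overline{\psi_1} \vee \overline{\psi_2} & \text{otherwise} \end{cases} \]

\[ \overline{\exists x.\psi} = \begin{cases} [\exists x.(\delta(x, X_1, \ldots, X_n) \wedge \overline{\psi})]^{+} & \text{if } \exists x.\psi \text{ is syntactically unambiguous} \\ \exists x.(\delta(x, X_1, \ldots, X_n)^{+} \wedge \overline{\psi}) & \text{otherwise} \end{cases} \]

\[ \overline{\exists X.\psi} = \begin{cases} [\exists X.\forall x.(x \in X \to \delta(x, X_1, \ldots, X_n)) \wedge \overline{\psi}]^{+} & \text{if } \exists X.\psi \text{ is syntactically unambiguous} \\ \exists X.\forall x.(x \in X \to \delta(x, X_1, \ldots, X_n))^{+} \wedge \overline{\psi} & \text{otherwise} \end{cases} \]

\[ \overline{\forall x.\psi} = \forall x.\, \delta(x, X_1, \ldots, X_n) \xrightarrow{+} \overline{\psi} \]

\[ \overline{\forall X.\psi} = \forall X.\, (\forall x.\, x \in X \to \delta(x, X_1, \ldots, X_n)) \xrightarrow{+} \overline{\psi} \]

Now let $\varphi$ be as required such that $[\![\varphi]\!] = S$. One can show by induction on the structure
of $\varphi$ that $[\![\exists X_1, \ldots, X_n.\, \vartheta(X_1, \ldots, X_n)^{+} \wedge \overline{\varphi}]\!] = \Phi^{-1}(S)$. By construction we get that if $\varphi$ is
syntactically unambiguous, then so is its translation $\overline{\varphi}$. Again by induction it is therefore
not hard to see that $\overline{\varphi}$ is in aUMSO($\mathbb{K}$) (resp. wUMSO($\mathbb{K}$)) if $\varphi$ is in aUMSO($\mathbb{K}$) (resp.
wUMSO($\mathbb{K}$)). From this we conclude that the translation is as required. □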

We are now going to show that regular series coincide with sRMSO($\mathbb{K}$)-definable ones.
For this we define two embeddings of nested words into alternating texts and use the
characterizations of text series. The connection we establish turns out to be useful again in
Section 6. Define $\Phi_\circ, \Phi_\bullet : \mathrm{NW}(A) \to \mathrm{TXT}(A)$ as follows. Let $nw = (w, \nu) \in \mathrm{NW}(A)$
where $w = a_1 \ldots a_n$. If $\nu = \emptyset$, then let $\Phi_\circ(nw) = a_1 \circ \ldots \circ a_n$ and $\Phi_\bullet(nw) = a_1 \bullet \ldots \bullet a_n$.
If $\nu \neq \emptyset$, let $i$ be the minimal call position and $j$ the corresponding return position. Let
$nw' = nw[i+1, j-1]$ and $nw'' = nw[j+1, n]$. Suppose for the moment that $i + 1 \le j - 1$
and $j + 1 \le n$. We define

\[ \Phi_\circ(nw) = a_1 \circ \ldots \circ a_{i-1} \circ (a_i \bullet \Phi_\bullet(nw') \bullet a_j) \circ \Phi_\circ(nw''), \]

\[ \Phi_\bullet(nw) = a_1 \bullet \ldots \bullet a_{i-1} \bullet (a_i \circ \Phi_\circ(nw') \circ a_j) \bullet \Phi_\bullet(nw''). \]

If $i + 1 = j$ or $j = n$, then we just ignore the terms $\Phi_\circ(nw')$, $\Phi_\circ(nw'')$, $\Phi_\bullet(nw')$ and $\Phi_\bullet(nw'')$,
respectively, in the definition above. Intuitively, we transform the nesting relation into
well-matched brackets. As an example consider the nested word $nw$ given in Figure 1. Its coding
$\Phi_\circ(nw)$ is the alternating text in Figure 2.
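The recursive definition of the two embeddings can be illustrated by a small Python sketch. It is not part of the original construction: the function name `embed`, the encoding of the nesting relation as a dictionary from call positions to return positions (1-based), and the rendering of texts as bracketed strings are assumptions made for illustration, and well-matched nesting is assumed.

```python
CIRC, BULLET = "∘", "•"

def embed(word, nu, outer=CIRC):
    """Render Phi_outer(nw) for nw = (word, nu) as a bracketed term.

    word: string a_1 ... a_n; nu: dict mapping each call position to its
    return position (1-based, well-matched nesting assumed).
    """
    def build(lo, hi, op):
        # Inside an arch the roles of the two products are swapped.
        inner_op = BULLET if op == CIRC else CIRC
        factors = []
        p = lo
        while p <= hi:
            if p in nu:                      # p is a call position
                q = nu[p]                    # its matching return position
                inner = build(p + 1, q - 1, inner_op)
                block = [word[p - 1]] + ([inner] if inner else []) + [word[q - 1]]
                factors.append("(" + f" {inner_op} ".join(block) + ")")
                p = q + 1
            else:                            # internal position
                factors.append(word[p - 1])
                p += 1
        return f" {op} ".join(factors)

    return build(1, len(word), outer)


print(embed("abcdef", {1: 4, 2: 3, 5: 6}))
# (a • (b ∘ c) • d) ∘ (e • f)
```

The loop processes all surface arches from left to right, which agrees with the recursive definition (minimal call position first) because both products are associative.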

Let $\Phi_\circ(nw) = (V^\circ, \lambda^\circ, \le_1^\circ, \le_2^\circ)$ and $\Phi_\bullet(nw) = (V^\bullet, \lambda^\bullet, \le_1^\bullet, \le_2^\bullet)$. The following
observations can easily be made by induction either on $n$ or on $|\nu|$:

(a) Both $V^\circ$ and $V^\bullet$ have cardinality $n$. We therefore assume from now on that $V^\circ =
V^\bullet = [n]$ such that $\le_1^\circ$ as well as $\le_1^\bullet$ is the usual order on $[n]$. It is easy to see that
$\lambda^\circ(i) = \lambda^\bullet(i) = a_i$.

(b) Both $\Phi_\circ$ and $\Phi_\bullet$ are injective.

Recall that a position of $nw$ has odd nesting depth if the number of open call positions is
odd (see Example 3.7).

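Lemma 5.3. Let $nw = (a_1 \ldots a_n, \nu) \in \mathrm{NW}(A)$, let $\Phi_\circ(nw) = ([n], \lambda, \le_1^\circ, \le_2^\circ)$ and let
$\Phi_\bullet(nw) = ([n], \lambda, \le_1^\bullet, \le_2^\bullet)$. Moreover, let $1 \le i < j \le n$. Then we have $i >_2^\circ j$ iff $i <_2^\bullet j$ iff
there is some $(k, \ell) \in \nu$ with $1 \le k \le i < j \le \ell \le n$ such that there is no $(k', \ell') \in \nu$ with
$k < k' \le i < j \le \ell' < \ell$ and $k$ has odd nesting depth.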

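Proof. The proof is by induction on $|\nu|$. For $|\nu| = 0$ this is trivial. Now let $|\nu| \ge 1$. We only
prove that $i >_2^\circ j$ iff there is some $(k, \ell) \in \nu$ with $1 \le k \le i < j \le \ell \le n$ such that there is
no $(k', \ell') \in \nu$ with $k < k' \le i < j \le \ell' < \ell$ and $k$ has odd nesting depth. That this holds iff
$i <_2^\bullet j$ can be shown analogously. Let $i'$ be the minimal call position and $j'$ the corresponding
return position. Let $nw' = nw[i'+1, j'-1]$ and $nw'' = nw[j'+1, n]$ provided they exist.
Moreover, let $\Phi_\bullet(nw') = ([j' - i' - 1], \lambda', \le'_1, \le'_2)$ and $\Phi_\circ(nw'') = ([n - j'], \lambda'', \le''_1, \le''_2)$. We
consider three cases:

(1) Assume $i < i'$ or $i \le j' < j$. Then $i <_2^\circ j$ and there is no $(k, \ell) \in \nu$ with $1 \le k \le i <
j \le \ell \le n$.

(2) Assume $i' \le i < j \le j'$. If $i = i'$ or $j = j'$, then $i >_2^\circ j$ and choosing $(k, \ell) = (i', j')$ gives
$(k, \ell)$ as required since $i'$ has nesting depth 1. If $i' < i < j < j'$, then we get:

\[
\begin{aligned}
i >_2^\circ j &\iff i - i' >'_2 j - i' \\
&\iff \text{not } i - i' <'_2 j - i' \\
&\iff \text{either there is some } (k, \ell) \in \nu[i'+1, j'-1] \text{ with } 1 \le k \le i - i' < j - i' \le \ell \le j' - i' - 1 \\
&\qquad \text{such that there is no } (k', \ell') \in \nu[i'+1, j'-1] \text{ with } k < k' \le i - i' < j - i' \le \ell' < \ell \\
&\qquad \text{and } k \text{ has even nesting depth in } nw', \text{ or there is no } (k, \ell) \in \nu[i'+1, j'-1] \\
&\qquad \text{with } 1 \le k \le i - i' < j - i' \le \ell \le j' - i' - 1 \\
&\iff \text{there is some } (k, \ell) \in \nu \text{ with } 1 \le k \le i < j \le \ell \le n \text{ such that there is no } (k', \ell') \in \nu \\
&\qquad \text{with } k < k' \le i < j \le \ell' < \ell \text{ and } k \text{ has odd nesting depth.}
\end{aligned}
\]

(3) Assume $j' < i$. Then we get

\[
\begin{aligned}
i >_2^\circ j &\iff i - j' >''_2 j - j' \\
&\iff \text{there is some } (k, \ell) \in \nu[j'+1, n] \text{ with } 1 \le k \le i - j' < j - j' \le \ell \le n - j' \\
&\qquad \text{such that there is no } (k', \ell') \in \nu[j'+1, n] \text{ with } k < k' \le i - j' < j - j' \le \ell' < \ell \\
&\qquad \text{and } k \text{ has odd nesting depth} \\
&\iff \text{there is some } (k, \ell) \in \nu \text{ with } 1 \le k \le i < j \le \ell \le n \text{ such that there is no } (k', \ell') \in \nu \\
&\qquad \text{with } k < k' \le i < j \le \ell' < \ell \text{ and } k \text{ has odd nesting depth.}
\end{aligned}
\]

□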

Corollary 5.4. The functions $\Phi_\circ$ and $\Phi_\bullet$ are unambiguously FO-definable.

Proof. We only show that $\Phi_\circ$ is FO-definable. For $\Phi_\bullet$ the claim can be shown analogously.
We give a 1-copying definition scheme $(\vartheta, \delta, (\varphi_{\mathrm{Lab}_a})_{a \in A}, \varphi_{\le_1}, \varphi_{\le_2})$ with four parameters
$X_1, X_2, Y_1, Y_2$.

Let the macros $\mathrm{call}(x)$ and $\mathrm{return}(x)$ be as in Example 3.7. Moreover, let

\[ \mathrm{Frst}_\nu(x) = \mathrm{call}(x) \wedge \forall y.\, \mathrm{call}(y) \to x \le y. \]

The next macro defines $y$, the next call or return position following position $x$.

\[ \mathrm{next}_\nu(x, y) = x < y \wedge (\mathrm{call}(y) \vee \mathrm{return}(y)) \wedge \forall z.(x < z < y) \to (\neg \mathrm{call}(z) \wedge \neg \mathrm{return}(z)) \]

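We now define the formula $\vartheta(X_1, X_2, Y_1, Y_2)$ which for all $nw = (a_1 \ldots a_n, \nu) \in \mathrm{NW}(A)$ and
$C_1, C_2, R_1, R_2 \subseteq [n]$ has the property that $nw \models \vartheta[C_1, C_2, R_1, R_2]$ iff $C_1$ is the set of all call
positions of odd nesting depth, $C_2$ is the set of all call positions of even nesting depth, $R_1$ is
the set of all return positions of even nesting depth and $R_2$ is the set of all return positions
of odd nesting depth.

\[
\begin{aligned}
\vartheta(X_1, X_2, Y_1, Y_2) = {} & (X_1 \cap X_2 = \emptyset) \wedge \forall z.(z \in X_1 \vee z \in X_2) \to \mathrm{call}(z) \\
& \wedge (Y_1 \cap Y_2 = \emptyset) \wedge \forall z.(z \in Y_1 \vee z \in Y_2) \to \mathrm{return}(z) \\
& \wedge \forall z.\, \mathrm{Frst}_\nu(z) \to z \in X_1 \\
& \wedge \forall z_1, z_2.((z_1 \in X_1 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{return}(z_2)) \to z_2 \in Y_1) \\
& \wedge \forall z_1, z_2.((z_1 \in X_1 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{call}(z_2)) \to z_2 \in X_2) \\
& \wedge \forall z_1, z_2.((z_1 \in X_2 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{return}(z_2)) \to z_2 \in Y_2) \\
& \wedge \forall z_1, z_2.((z_1 \in X_2 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{call}(z_2)) \to z_2 \in X_1) \\
& \wedge \forall z_1, z_2.((z_1 \in Y_1 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{return}(z_2)) \to z_2 \in Y_2) \\
& \wedge \forall z_1, z_2.((z_1 \in Y_1 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{call}(z_2)) \to z_2 \in X_1) \\
& \wedge \forall z_1, z_2.((z_1 \in Y_2 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{return}(z_2)) \to z_2 \in Y_1) \\
& \wedge \forall z_1, z_2.((z_1 \in Y_2 \wedge \mathrm{next}_\nu(z_1, z_2) \wedge \mathrm{call}(z_2)) \to z_2 \in X_2)
\end{aligned}
\]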

where $X \cap Y = \emptyset$ abbreviates $\neg(\exists z.\, z \in X \wedge z \in Y)$. We let $\delta(x, X_1, X_2, Y_1, Y_2)$ be some
tautology. Now we define the interpreting formulae. We set $\varphi_{\mathrm{Lab}_a}(x, X_1, X_2, Y_1, Y_2) =
\mathrm{Lab}_a(x)$ and let $\varphi_{\le_1}(x, y, X_1, X_2, Y_1, Y_2) = x \le y$. Furthermore, we define $\varphi(x, y, X_1)$ to
be the following formula which expresses the condition of Lemma 5.3:

\[
\begin{aligned}
\varphi(x, y, X_1) = x < y \wedge \big(\exists z_1, z_2.\; & (z_1 \le x < y \le z_2) \wedge \nu(z_1, z_2) \wedge z_1 \in X_1 \\
& \wedge \forall z'_1, z'_2.\, (z_1 < z'_1 \le x < y \le z'_2 < z_2) \to \neg \nu(z'_1, z'_2)\big)
\end{aligned}
\]

and let

\[ \varphi_{\le_2}(x, y, X_1) = x = y \vee (y < x \wedge \varphi(y, x, X_1)) \vee (x < y \wedge \neg \varphi(x, y, X_1)). \]

This completes the definition scheme for $\Phi_\circ$, which is unambiguous. □
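The case distinction in the parameter formula of the proof above can be mimicked by a single left-to-right scan. The following Python sketch is only an illustration under stated assumptions: the names `parity_sets`, `depth` and `parity_sets_by_depth` are hypothetical, the nesting relation is encoded as a dictionary from call to return positions, nesting is assumed to be well-matched, and the nesting depth of a position $p$ is assumed to be the number of arches $(k, \ell)$ with $k \le p < \ell$ (Example 3.7 is not reproduced here).

```python
def parity_sets(n, nu):
    """Compute (X1, X2, Y1, Y2) by following the implications of the formula:
    the class of a call/return position is determined by the class of the
    previous call/return position."""
    calls, returns = set(nu), set(nu.values())
    sets = {"X1": set(), "X2": set(), "Y1": set(), "Y2": set()}
    prev = None  # class of the previous call or return position
    for z in range(1, n + 1):
        if z in calls:
            # first call, or call after X2 / Y1 -> X1; after X1 / Y2 -> X2
            cls = "X2" if prev in ("X1", "Y2") else "X1"
        elif z in returns:
            # return after X1 / Y2 -> Y1; after X2 / Y1 -> Y2
            cls = "Y1" if prev in ("X1", "Y2") else "Y2"
        else:
            continue  # internal positions are not classified
        sets[cls].add(z)
        prev = cls
    return sets["X1"], sets["X2"], sets["Y1"], sets["Y2"]


def depth(p, nu):
    """Assumed nesting depth: number of arches (k, l) with k <= p < l."""
    return sum(1 for k, l in nu.items() if k <= p < l)


def parity_sets_by_depth(n, nu):
    """Same four sets, computed directly from the assumed depth function."""
    c1 = {k for k in nu if depth(k, nu) % 2 == 1}
    c2 = {k for k in nu if depth(k, nu) % 2 == 0}
    r1 = {l for l in nu.values() if depth(l, nu) % 2 == 0}
    r2 = {l for l in nu.values() if depth(l, nu) % 2 == 1}
    return c1, c2, r1, r2
```

On the nested word with arches (1,4), (2,3), (5,6) both functions return the sets {1,5}, {2}, {4,6}, {3}, which matches the intended reading of the four parameters.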

Let $\tau = ([n], \lambda, \le_1, \le_2)$ be a text. An interval $[i, j] = \{k \in [n] \mid i \le_1 k \le_1 j\}$ of the
first order is a clan if it is an interval also of the second order. A prime clan is a clan that
does not overlap with any other, i.e. there is no clan $[k, \ell]$ such that $k <_1 i \le_1 \ell <_1 j$ or
$i <_1 k \le_1 j <_1 \ell$.

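Lemma 5.5. Let $nw = (a_1 \ldots a_n, \nu) \in \mathrm{NW}(A)$, let $\Phi_\circ(nw) = ([n], \lambda, \le_1^\circ, \le_2^\circ)$ and let
$\Phi_\bullet(nw) = ([n], \lambda, \le_1^\bullet, \le_2^\bullet)$. Moreover, let $1 \le i < j \le n$.

Then $(i, j) \in \nu$ iff $[i, j]$ is a prime clan of $\Phi_\circ(nw)$ and we have either $i \neq 1$, $j \neq n$ or $1 >_2^\circ n$
iff $[i, j]$ is a prime clan of $\Phi_\bullet(nw)$ and we have either $i \neq 1$, $j \neq n$ or $1 <_2^\bullet n$.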
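The statement of the lemma can be checked on small examples. The sketch below is illustrative only and rests on assumptions: a text over positions 1..n is represented by the list of positions in the second order (the first order being the usual one), the horizontal product keeps both orders while the vertical product reverses the second order, and all function names are hypothetical.

```python
def join(factors, op):
    """Second order of a product: 'circ' keeps the factor order, 'bullet' reverses it."""
    if op == "bullet":
        factors = list(reversed(factors))
    return [pos for f in factors for pos in f]


def second_order(n, nu, outer="circ"):
    """Positions of Phi_outer(nw) listed in the second order (nu: call -> return)."""
    def build(lo, hi, op):
        inner_op = "bullet" if op == "circ" else "circ"
        factors, p = [], lo
        while p <= hi:
            if p in nu:
                q = nu[p]
                factors.append(join([[p], build(p + 1, q - 1, inner_op), [q]], inner_op))
                p = q + 1
            else:
                factors.append([p])
                p += 1
        return join(factors, op)
    return build(1, n, outer)


def arches_from_text(n, order2):
    """Recover the nesting relation from the circ-embedding via prime clans."""
    rank = {p: r for r, p in enumerate(order2)}

    def is_clan(i, j):
        ranks = [rank[p] for p in range(i, j + 1)]
        return max(ranks) - min(ranks) == j - i   # contiguous in the second order

    clans = [(i, j) for i in range(1, n + 1) for j in range(i, n + 1) if is_clan(i, j)]

    def overlaps(c, d):
        (i, j), (k, l) = c, d
        return k < i <= l < j or i < k <= j < l

    primes = [c for c in clans if not any(overlaps(c, d) for d in clans)]
    return {(i, j) for i, j in primes
            if i < j and ((i, j) != (1, n) or rank[1] > rank[n])}


nu = {1: 4, 2: 3, 5: 6}
print(second_order(6, nu))                        # [4, 2, 3, 1, 6, 5]
print(arches_from_text(6, second_order(6, nu)) == set(nu.items()))  # True
```

Note that the intervals [1,3] and [2,4] are clans in this example but overlap each other, so only the arches survive as non-trivial prime clans (apart from the whole domain, which is excluded by the side condition).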

Proof. The proof is again by induction on $|\nu|$. If $\nu = \emptyset$, then $[1, n]$ is the only prime clan
of both $\Phi_\circ(nw)$ and $\Phi_\bullet(nw)$ (since any other clan can be overlapped) and we have $1 <_2^\circ n$
and $1 >_2^\bullet n$. Now let $|\nu| \ge 1$ and let $(i_1, j_1), (i_2, j_2), \ldots, (i_t, j_t)$ with $i_1 < i_2 < \ldots < i_t$ be
the sequence of surface arches (see definition after Def. 2.1). By definition we have

\[
\begin{aligned}
\Phi_\circ(nw) = {} & \Phi_\circ(nw[1, i_1 - 1]) \circ \Phi_\circ(nw[i_1, j_1]) \circ \cdots \\
& \circ \Phi_\circ(nw[j_{t-1} + 1, i_t - 1]) \circ \Phi_\circ(nw[i_t, j_t]) \circ \Phi_\circ(nw[j_t + 1, n]),
\end{aligned}
\]

where we ignore a factor if the corresponding interval is empty. We show that $(i, j) \in \nu$ iff
$[i, j]$ is a prime clan of $\Phi_\circ(nw)$ and we have either $i \neq 1$, $j \neq n$ or $1 >_2^\circ n$. That this holds
iff $[i, j]$ is a prime clan of $\Phi_\bullet(nw)$ and we have either $i \neq 1$, $j \neq n$ or $1 <_2^\bullet n$ can again be
shown analogously.

(Only if). Let $(i, j) \in \nu$. Then there is some $r$ such that $i_r \le i < j \le j_r$.

If $i = i_r$ or $j = j_r$, then $i = i_r$ and $j = j_r$. Clearly, $[i_r, j_r]$ is a clan. Suppose for
contradiction that there is a clan $[\ell, k]$ overlapping $[i_r, j_r]$. Assume $\ell < i_r \le k < j_r$ (the
case $i_r < \ell \le j_r < k$ is similar). By definition of $\Phi_\circ$ we get $\ell <_2^\circ j_r <_2^\circ i_r$. Contradiction.
Thus $[i_r, j_r]$ is a prime clan. In particular if $i_r = 1$ and $j_r = n$, we get $1 >_2^\circ n$.

Otherwise, in case of $i_r < i < j < j_r$, the interval $[i - i_r, j - i_r]$ is a prime clan of
$\Phi_\bullet(nw[i_r + 1, j_r - 1])$ by induction hypothesis. Thus, $[i, j]$ must be a clan, since $[i_r, j_r]$
is a clan, too. Suppose for contradiction that there is a clan $[\ell, k]$ overlapping $[i, j]$. As
$[i - i_r, j - i_r]$ is a prime clan of $\Phi_\bullet(nw[i_r + 1, j_r - 1])$ we get either $\ell \le i_r$ or $k \ge j_r$. Assume
$\ell \le i_r$ (the other case is similar). Now, if $\ell < i_r$, we can argue as above and separate $\ell$ and
$i_r$. Contradiction. If $i_r = \ell$ and $i_r + 1 < i$, then $[1, k - i_r]$ is a clan in $\Phi_\bullet(nw[i_r + 1, j_r - 1])$
which overlaps $[i - i_r, j - i_r]$. Contradiction. And if $\ell = i_r$ and $i_r + 1 = i$, we get by
definition $i <_2^\circ j <_2^\circ i_r$. Again contradiction. Thus $[i, j]$ must be a prime clan.

(If). Let $[i, j]$ be a prime clan such that not $i = 1$, $j = n$ and $1 <_2^\circ n$. If $i = 1$ and $j = n$,
then $1 >_2^\circ n$ and $(i, j) \in \nu$ by definition of $\Phi_\circ$. Now suppose $1 < i$ or $j < n$. The following
intervals (provided they exist) can easily be seen to be clans: $[1, i_1 - 1]$, $[i_1, j_1]$, $[j_1 + 1, n]$, $[1, j_1]$
and $[\ell, n]$ for any $\ell \le i_1$. From this we conclude that either $i_1 \le i < j \le j_1$ or $j_1 < i$ since
otherwise one of the clans above would overlap $[i, j]$. If $i = i_1$ or $j = j_1$ then $i = i_1$ and
$j = j_1$, since $[i_1, j_1 - 1]$ and $[i_1 + 1, j_1]$ are clans, and hence $(i, j) \in \nu$. In the case where
$i_1 < i < j < j_1$, we get that $[i - i_1, j - i_1]$ must be a prime clan of $\Phi_\bullet(nw[i_1 + 1, j_1 - 1])$
and if $j_1 < i$, we get that $[i - j_1, j - j_1]$ must be a prime clan of $\Phi_\circ(nw[j_1 + 1, n])$. Hence,
in both cases $(i, j) \in \nu$ by induction hypothesis. □

It is not hard to see that the domains of the partial functions $\Phi_\circ^{-1}$ and $\Phi_\bullet^{-1}$ are
FO-definable. Hence, by the last lemma there is a definition scheme without parameters
consisting of FO-formulae which defines $\Phi_\circ^{-1}$ (or alternatively $\Phi_\bullet^{-1}$).

Corollary 5.6. The partial functions $\Phi_\circ^{-1}$ and $\Phi_\bullet^{-1}$ are unambiguously FO-definable.

So far we have seen that we can translate a formula over nested words into a formula
over texts (and vice versa) such that the formulae correspond to each other with respect to
$\Phi_\circ$ resp. $\Phi_\bullet$. We will now show that also WPA can simulate WNWA (and vice versa) with
respect to $\Phi_\circ$ resp. $\Phi_\bullet$.

Proposition 5.7. Let $S : \mathrm{TXT}(A) \to \mathbb{K}$ be regular. Then $\Phi_\circ^{-1}(S), \Phi_\bullet^{-1}(S) : \mathrm{NW}(A) \to \mathbb{K}$
are regular.

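Proof. We show that $\Phi_\circ^{-1}(S)$ is regular. Analogously one can show that $\Phi_\bullet^{-1}(S)$ is regular.
Let $\mathcal{P} = (\mathcal{H}, \mathcal{V}, \Omega, \mu, \mu_{\mathrm{op}}, \mu_{\mathrm{cl}}, \lambda, \gamma)$ be a WPA such that $\|\mathcal{P}\| = S$. We construct a WNWA
$\mathcal{A} = (Q, \iota, \delta, \kappa)$ with state space $Q = (\mathcal{H} \uplus \mathcal{V}) \times (\Omega \uplus \{i\})$ such that for all $h_0, h_n \in \mathcal{H}$,
$v_0, v_n \in \mathcal{V}$ and $\omega \in \Omega \uplus \{i\}$ we have

\[ \sum_{r : (h_0, \omega) \xrightarrow{nw} (h_n, \omega)} \mathrm{wgt}_{\mathcal{A}}(r) = \sum_{r : h_0 \xrightarrow{\Phi_\circ(nw)} h_n} \mathrm{wgt}_{\mathcal{P}}(r) \quad\text{and}\quad \sum_{r : (v_0, \omega) \xrightarrow{nw} (v_n, \omega)} \mathrm{wgt}_{\mathcal{A}}(r) = \sum_{r : v_0 \xrightarrow{\Phi_\bullet(nw)} v_n} \mathrm{wgt}_{\mathcal{P}}(r). \tag{5.1} \]

Intuitively, in the first component one simulates the states of the WPA and in the
second component one stores the most recent open bracket. This has to be updated when
reading a return position using the look-back ability of the WNWA. We now give the formal
definition of the transition functions. We give it only on certain subsets of their domains.
In all other cases we set the values to 0. Let $a \in A$, $h_1, h_2 \in \mathcal{H}$, $v_1, v_2 \in \mathcal{V}$, $\omega_1 \in \Omega \uplus \{i\}$
and $\omega_2 \in \Omega$. Define

\[
\begin{aligned}
\delta_{\mathrm{int}}((h_1, \omega_1), a, (h_2, \omega_1)) &= \mu(h_1, a, h_2) \\
\delta_{\mathrm{int}}((v_1, \omega_1), a, (v_2, \omega_1)) &= \mu(v_1, a, v_2) \\
\delta_{\mathrm{call}}((h_1, \omega_1), a, (v_1, \omega_2)) &= \sum_{v \in \mathcal{V}} \mu_{\mathrm{op}}(h_1, (_{\omega_2}, v) \cdot \mu(v, a, v_1) \\
\delta_{\mathrm{call}}((v_1, \omega_1), a, (h_1, \omega_2)) &= \sum_{h \in \mathcal{H}} \mu_{\mathrm{op}}(v_1, (_{\omega_2}, h) \cdot \mu(h, a, h_1) \\
\delta_{\mathrm{ret}}((h_1, \omega_2), (v_1, \omega_1), a, (v_2, \omega_1)) &= \sum_{h \in \mathcal{H}} \mu(h_1, a, h) \cdot \mu_{\mathrm{cl}}(h, )_{\omega_2}, v_2) \\
\delta_{\mathrm{ret}}((v_1, \omega_2), (h_1, \omega_1), a, (h_2, \omega_1)) &= \sum_{v \in \mathcal{V}} \mu(v_1, a, v) \cdot \mu_{\mathrm{cl}}(v, )_{\omega_2}, h_2).
\end{aligned}
\]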

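Observe that for any $nw \in \mathrm{NW}(A)$ and any run $r : q_0 \xrightarrow{nw} q_n$ of $\mathcal{A}$ such that $\mathrm{wgt}_{\mathcal{A}}(r) \neq 0$
the second components of $q_0$ and $q_n$ coincide and the first components are either both in $\mathcal{H}$
or both in $\mathcal{V}$.

Let $nw = (a_1 \ldots a_n, \nu)$. We show Equation (5.1) by induction on $|\nu|$. First let $\nu = \emptyset$.
Then for all $h_0, h_n \in \mathcal{H}$ and $\omega \in \Omega \uplus \{i\}$ we have

\[
\sum_{r : (h_0, \omega) \xrightarrow{nw} (h_n, \omega)} \mathrm{wgt}_{\mathcal{A}}(r)
= \sum_{h_1, \ldots, h_{n-1} \in \mathcal{H}} \prod_{j=1}^{n} \delta_{\mathrm{int}}((h_{j-1}, \omega), a_j, (h_j, \omega))
= \sum_{h_1, \ldots, h_{n-1} \in \mathcal{H}} \prod_{j=1}^{n} \mu(h_{j-1}, a_j, h_j)
= \sum_{r : h_0 \xrightarrow{\Phi_\circ(nw)} h_n} \mathrm{wgt}_{\mathcal{P}}(r).
\]

Similarly we get the claim for $\Phi_\bullet$. Now, let $\nu \neq \emptyset$, let $k$ be the minimal call position and
let $\ell$ be the corresponding return position. Let $nw_1 = nw[1, k-1]$, $nw_2 = nw[k+1, \ell-1]$
and $nw_3 = nw[\ell+1, n]$ (we assume that all nested words exist, the cases where they do not
exist are similar). Then for all $h_0, h_n \in \mathcal{H}$ and $\omega \in \Omega \uplus \{i\}$ we have

\[
\begin{aligned}
\sum_{r : (h_0, \omega) \xrightarrow{nw} (h_n, \omega)} \mathrm{wgt}_{\mathcal{A}}(r)
= {} & \sum_{\substack{h_{k-1}, h_\ell \in \mathcal{H},\ \omega_1 \in \Omega \\ v_k, v_{\ell-1} \in \mathcal{V}}}\;
\sum_{r_1 : (h_0, \omega) \xrightarrow{nw_1} (h_{k-1}, \omega)} \mathrm{wgt}_{\mathcal{A}}(r_1) \cdot \delta_{\mathrm{call}}((h_{k-1}, \omega), a_k, (v_k, \omega_1)) \\
& \cdot \sum_{r_2 : (v_k, \omega_1) \xrightarrow{nw_2} (v_{\ell-1}, \omega_1)} \mathrm{wgt}_{\mathcal{A}}(r_2) \cdot \delta_{\mathrm{ret}}((v_{\ell-1}, \omega_1), (h_{k-1}, \omega), a_\ell, (h_\ell, \omega))
\cdot \sum_{r_3 : (h_\ell, \omega) \xrightarrow{nw_3} (h_n, \omega)} \mathrm{wgt}_{\mathcal{A}}(r_3) \\
= {} & \sum_{\substack{h_{k-1}, h_\ell \in \mathcal{H},\ \omega_1 \in \Omega \\ v_k, v_{\ell-1} \in \mathcal{V}}}\;
\sum_{r_1 : h_0 \xrightarrow{a_1 \circ \cdots \circ a_{k-1}} h_{k-1}} \mathrm{wgt}_{\mathcal{P}}(r_1) \cdot \sum_{v \in \mathcal{V}} \mu_{\mathrm{op}}(h_{k-1}, (_{\omega_1}, v) \cdot \mu(v, a_k, v_k) \\
& \cdot \sum_{r_2 : v_k \xrightarrow{\Phi_\bullet(nw_2)} v_{\ell-1}} \mathrm{wgt}_{\mathcal{P}}(r_2) \cdot \sum_{v' \in \mathcal{V}} \mu(v_{\ell-1}, a_\ell, v') \cdot \mu_{\mathrm{cl}}(v', )_{\omega_1}, h_\ell)
\cdot \sum_{r_3 : h_\ell \xrightarrow{\Phi_\circ(nw_3)} h_n} \mathrm{wgt}_{\mathcal{P}}(r_3) \\
= {} & \sum_{r : h_0 \xrightarrow{\Phi_\circ(nw)} h_n} \mathrm{wgt}_{\mathcal{P}}(r).
\end{aligned}
\]

Again, the claim is shown similarly for $\Phi_\bullet$. This concludes the proof of Equation (5.1).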
Now consider the WNWA with states $Q' = \{\bot, ?, s, \circ, \bullet\}$ and transition functions
$\delta'_{\mathrm{call}}, \delta'_{\mathrm{int}}, \delta'_{\mathrm{ret}}$ given for all $a \in A$ and $p \in Q' \setminus \{\bot\}$ by

\[
\begin{aligned}
& \delta'_{\mathrm{call}}(\bot, a, ?) = \delta'_{\mathrm{int}}(\bot, a, s) = \delta'_{\mathrm{call}}(s, a, \circ) = \delta'_{\mathrm{int}}(s, a, \circ) = \delta'_{\mathrm{call}}(?, a, ?) = \delta'_{\mathrm{int}}(?, a, ?) \\
& = \delta'_{\mathrm{ret}}(?, p, a, ?) = \delta'_{\mathrm{ret}}(?, \bot, a, \bullet) = \delta'_{\mathrm{call}}(\bullet, a, \circ) = \delta'_{\mathrm{int}}(\bullet, a, \circ) = \delta'_{\mathrm{call}}(\circ, a, \circ) \\
& = \delta'_{\mathrm{int}}(\circ, a, \circ) = \delta'_{\mathrm{ret}}(\circ, p, a, \circ) = 1.
\end{aligned}
\]

Set any other values of $\delta'_{\mathrm{call}}, \delta'_{\mathrm{int}}, \delta'_{\mathrm{ret}}$ to 0 and let the initial distribution $\iota'$ be given by
$\iota'(q') = 1$ if $q' = \bot$ and 0 otherwise. Observe that in the case where the final distribution
$\kappa'$ is given by $\kappa'(q') = 1$ if $q' = \circ$ and 0 otherwise, the behavior of the automaton is the
characteristic series of the set of nested words $nw$ such that $\Phi_\circ(nw)$ is a $\circ$-product. We
collect such nested words in $\mathrm{NW}^\circ$. In the case where the final distribution $\kappa'$ is given by
$\kappa'(q') = 1$ if $q' = \bullet$ and 0 otherwise, the behavior of the automaton is the characteristic
series of the set of nested words $nw$ such that $\Phi_\circ(nw)$ is a $\bullet$-product. We collect such nested
words in $\mathrm{NW}^\bullet$. Finally, in the case where the final distribution $\kappa'$ is given by $\kappa'(q') = 1$ if
$q' = s$ and 0 otherwise, the behavior of the automaton is the characteristic series of the set
of all singleton nested words, i.e. $A$.

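Now consider the product of this automaton with $\mathcal{A}$ which has states $Q \times Q'$ and whose
transition functions $\delta^\times_{\mathrm{call}}, \delta^\times_{\mathrm{int}}, \delta^\times_{\mathrm{ret}}$ are given by letting $\delta^\times_{\mathrm{call}}((q, q'), a, (p, p')) = \delta_{\mathrm{call}}(q, a, p) \cdot
\delta'_{\mathrm{call}}(q', a, p')$ for all $q, p \in Q$ and $q', p' \in Q'$. If we define the initial and final distribution $\iota^\times$
and $\kappa^\times$ by letting for all $h \in \mathcal{H}$ and $\omega \in \Omega$

\[
\begin{aligned}
\iota^\times((h, i), \bot) &= \lambda(h), & \iota^\times((h, \omega), \bot) &= \sum_{v \in \mathcal{V}} \lambda(v) \cdot \mu_{\mathrm{op}}(v, (_\omega, h), \\
\kappa^\times((h, i), \circ) &= \gamma(h), & \kappa^\times((h, \omega), \circ) &= \sum_{v \in \mathcal{V}} \mu_{\mathrm{cl}}(h, )_\omega, v) \cdot \gamma(v),
\end{aligned}
\]

and in any other case by setting the value to 0, then the behavior of the resulting automaton
is $1_{\mathrm{NW}^\circ} \odot \Phi_\circ^{-1}(S)$. Changing the definitions of $\iota^\times, \kappa^\times$ appropriately gives automata with
behavior $1_{\mathrm{NW}^\bullet} \odot \Phi_\circ^{-1}(S)$ and $1_A \odot \Phi_\circ^{-1}(S)$. The automaton obtained from disjoint copies
of these three automata has hence the behavior $\Phi_\circ^{-1}(S)$. □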

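Proposition 5.8. Let $S : \mathrm{NW}(A) \to \mathbb{K}$ be a regular series. Then $\Phi_\circ(S), \Phi_\bullet(S) : \mathrm{TXT}(A) \to
\mathbb{K}$ are regular.

Proof. Let $\mathcal{A} = (Q, \iota, \delta, \kappa)$ be a WNWA. We define a WPA $\mathcal{P} = (\mathcal{H}, \mathcal{V}, \Omega, \mu, \mu_{\mathrm{op}}, \mu_{\mathrm{cl}}, \lambda, \gamma)$
with

\[ \mathcal{H} = \{q^{\mathcal{H}} \mid q \in Q\} \times (\{c, i\} \uplus A) \quad\text{and}\quad \mathcal{V} = \{q^{\mathcal{V}} \mid q \in Q\} \times (\{c, i\} \uplus A) \]

as well as $\Omega = Q$ such that $(\|\mathcal{P}\|, \Phi_\circ(nw)) = (\|\mathcal{A}\|, nw)$ for all $nw \in \mathrm{NW}(A)$. To prove the
result for $\Phi_\bullet$ only $\lambda$ and $\gamma$ have to be changed.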

Intuitively, in the first component one simulates the states of the WNWA, in the second 
component one either selects whether the next transition is a call or an internal transition, or 
one stores the letter to simulate a return position with the next bracket. Look-back behavior 
is simulated by storing a state in the opening bracket and closing it at the appropriate return 
position. 

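We formally define $\mu, \mu_{\mathrm{op}}, \mu_{\mathrm{cl}}$ as follows. We give the definition only on certain subsets
of their domains. In all other cases we set their values to 0.

\[
\begin{aligned}
\mu((q_1^{\mathcal{H}}, i), a, (q_2^{\mathcal{H}}, i)) &= \delta_{\mathrm{int}}(q_1, a, q_2) \\
\mu((q_1^{\mathcal{V}}, i), a, (q_2^{\mathcal{V}}, i)) &= \delta_{\mathrm{int}}(q_1, a, q_2) \\
\mu((q_1^{\mathcal{H}}, c), a, (q_2^{\mathcal{H}}, i)) &= \delta_{\mathrm{call}}(q_1, a, q_2) \\
\mu((q_1^{\mathcal{V}}, c), a, (q_2^{\mathcal{V}}, i)) &= \delta_{\mathrm{call}}(q_1, a, q_2) \\
\mu((q_1^{\mathcal{H}}, i), a, (q_1^{\mathcal{H}}, a)) &= 1 \\
\mu((q_1^{\mathcal{V}}, i), a, (q_1^{\mathcal{V}}, a)) &= 1 \\
\mu_{\mathrm{op}}((q_1^{\mathcal{H}}, i), (_{q_1}, (q_1^{\mathcal{V}}, c)) &= 1 \\
\mu_{\mathrm{op}}((q_1^{\mathcal{V}}, i), (_{q_1}, (q_1^{\mathcal{H}}, c)) &= 1 \\
\mu_{\mathrm{cl}}((q_1^{\mathcal{H}}, a), )_{q_2}, (q_3^{\mathcal{V}}, i)) &= \delta_{\mathrm{ret}}(q_1, q_2, a, q_3) \\
\mu_{\mathrm{cl}}((q_1^{\mathcal{V}}, a), )_{q_2}, (q_3^{\mathcal{H}}, i)) &= \delta_{\mathrm{ret}}(q_1, q_2, a, q_3) \\
\lambda(q_1^{\mathcal{H}}, i) &= \iota(q_1) \\
\gamma(q_1^{\mathcal{H}}, i) &= \kappa(q_1)
\end{aligned}
\]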



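We use induction on $nw = (a_1 \ldots a_n, \nu) \in \mathrm{NW}(A)$ to show that the defined WPA
behaves as required. More precisely we show that for all $q_1, q_2 \in Q$

\[ \sum_{r : (q_1^{\mathcal{H}}, i) \xrightarrow{\Phi_\circ(nw)} (q_2^{\mathcal{H}}, i)} \mathrm{wgt}_{\mathcal{P}}(r) = \sum_{r : q_1 \xrightarrow{nw} q_2} \mathrm{wgt}_{\mathcal{A}}(r) = \sum_{r : (q_1^{\mathcal{V}}, i) \xrightarrow{\Phi_\bullet(nw)} (q_2^{\mathcal{V}}, i)} \mathrm{wgt}_{\mathcal{P}}(r). \]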






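This is easy to see if $\nu = \emptyset$. Let $\nu \neq \emptyset$ and let $k$ be the minimal call position and let $\ell$
be the corresponding return position. Let $nw_1 = nw[1, k-1]$, $nw_2 = nw[k+1, \ell-1]$ and
$nw_3 = nw[\ell+1, n]$ (we assume that all nested words exist, the cases where they do not
exist are similar). Then

\[
\begin{aligned}
\sum_{r : (q_1^{\mathcal{H}}, i) \xrightarrow{\Phi_\circ(nw)} (q_2^{\mathcal{H}}, i)} \mathrm{wgt}_{\mathcal{P}}(r)
= {} & \sum_{q_3, q_4, q_5, q_6 \in Q}\; \sum_{r_1 : (q_1^{\mathcal{H}}, i) \xrightarrow{\Phi_\circ(nw_1)} (q_3^{\mathcal{H}}, i)} \mathrm{wgt}_{\mathcal{P}}(r_1)
\cdot \mu_{\mathrm{op}}((q_3^{\mathcal{H}}, i), (_{q_3}, (q_3^{\mathcal{V}}, c)) \cdot \mu((q_3^{\mathcal{V}}, c), a_k, (q_4^{\mathcal{V}}, i)) \\
& \cdot \sum_{r_2 : (q_4^{\mathcal{V}}, i) \xrightarrow{\Phi_\bullet(nw_2)} (q_5^{\mathcal{V}}, i)} \mathrm{wgt}_{\mathcal{P}}(r_2)
\cdot \mu((q_5^{\mathcal{V}}, i), a_\ell, (q_5^{\mathcal{V}}, a_\ell)) \cdot \mu_{\mathrm{cl}}((q_5^{\mathcal{V}}, a_\ell), )_{q_3}, (q_6^{\mathcal{H}}, i)) \\
& \cdot \sum_{r_3 : (q_6^{\mathcal{H}}, i) \xrightarrow{\Phi_\circ(nw_3)} (q_2^{\mathcal{H}}, i)} \mathrm{wgt}_{\mathcal{P}}(r_3) \\
= {} & \sum_{q_3, q_4, q_5, q_6 \in Q}\; \sum_{r_1 : q_1 \xrightarrow{nw_1} q_3} \mathrm{wgt}_{\mathcal{A}}(r_1) \cdot \delta_{\mathrm{call}}(q_3, a_k, q_4)
\cdot \sum_{r_2 : q_4 \xrightarrow{nw_2} q_5} \mathrm{wgt}_{\mathcal{A}}(r_2) \cdot \delta_{\mathrm{ret}}(q_5, q_3, a_\ell, q_6) \\
& \cdot \sum_{r_3 : q_6 \xrightarrow{nw_3} q_2} \mathrm{wgt}_{\mathcal{A}}(r_3) \\
= {} & \sum_{r : q_1 \xrightarrow{nw} q_2} \mathrm{wgt}_{\mathcal{A}}(r).
\end{aligned}
\]

We can proceed analogously for $\Phi_\bullet$. Now the result follows from the definition of $\lambda$ and
$\gamma$. □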

We can now prove Theorem 3.6.

Proof of Theorem 3.6. We prove Theorem 3.6(a). Let $S : \mathrm{NW}(A) \to \mathbb{K}$ be regular. By
Proposition 5.8, $\Phi_\circ(S) : \mathrm{TXT}(A) \to \mathbb{K}$ is regular and hence sREMSO($\mathbb{K}$)-definable by
Theorem 4.2. Now we get that $\Phi_\circ^{-1}(\Phi_\circ(S)) = S$ is sREMSO($\mathbb{K}$)-definable by Proposition 5.2
and Corollary 5.4.

Conversely, let $S : \mathrm{NW}(A) \to \mathbb{K}$ be sRMSO($\mathbb{K}$)-definable. By Corollary 5.6 and
Proposition 5.2, $\Phi_\circ(S) : \mathrm{TXT}(A) \to \mathbb{K}$ is sRMSO($\mathbb{K}$)-definable and thus by Theorem 4.2 regular.
From Proposition 5.7 we conclude that $\Phi_\circ^{-1}(\Phi_\circ(S)) = S$ is regular, too.

Similarly we get Theorem 3.6(b) from Theorem 4.2(b). Theorem 3.6(c) follows from
Theorem 4.2(c). □



Again note that all proofs are constructive. Hence, given a sentence $\varphi$ in sRMSO($\mathbb{K}$)
(resp. swRMSO($\mathbb{K}$), MSO($\mathbb{K}$)) we can effectively construct a WNWA $\mathcal{A}$ such that $\|\mathcal{A}\| = [\![\varphi]\!]$.
Conversely, given a WNWA $\mathcal{A}$ we can construct an sREMSO($\mathbb{K}$) sentence $\varphi$ such that
$\|\mathcal{A}\| = [\![\varphi]\!]$. The following results now follow easily from the corresponding results for series
over alternating texts [30].

Corollary 5.9. Let $\mathbb{K}$ be a locally finite semiring or let $\mathbb{K}$ be a ring and let $S : \mathrm{NW}(A) \to \mathbb{K}$
be regular such that $S(\mathrm{NW}(A)) \subseteq \mathbb{K}$ is finite. Moreover, let $A \subseteq \mathbb{K}$. Then $S^{-1}(A)$ is regular.

Corollary 5.10. Let $\mathbb{K}$ be a computable field or a computable locally finite semiring and let
$S_1, S_2 : \mathrm{NW}(A) \to \mathbb{K}$ be regular. It is decidable whether $S_1 = S_2$.

Corollary 5.11. Let $\mathbb{K}$ be a computable zero-sum free semiring and let $S : \mathrm{NW}(A) \to \mathbb{K}$ be
regular. It is decidable whether $(S, nw) = 0$ for all $nw \in \mathrm{NW}(A)$.

Note that one motivation for transforming formulae into automata is solving their
satisfiability problem. The last two corollaries can be seen as an extension of this: We have
shown that given a formula $\varphi \in$ sRMSO($\mathbb{K}$) (resp. $\varphi \in$ swRMSO($\mathbb{K}$), resp. $\varphi \in$ MSO($\mathbb{K}$))
we can effectively translate it into a weighted nested word automaton $\mathcal{A}$. Now, provided
the semiring is either zero-sum free or locally finite or a field, using the last two corollaries
we can test whether there is a nested word $nw$ which gets assigned a non-zero value, i.e.
$(\|\mathcal{A}\|, nw) = ([\![\varphi]\!], nw) \neq 0$.

6. An Application to Algebraic Formal Power Series 

In this section we consider algebraic formal power series and show that they arise as the 
projections of regular nested word series and regular alternating text series. Applying then 
our logical characterizations of the latter we obtain characterizations of algebraic formal 
power series in terms of weighted logics generalizing results of Lautemann, Schwentick and
Thérien [27] on context-free languages. Algebraic formal power series have been considered
initially already by Chomsky and Schützenberger [8] and have since been intensively studied
by Kuich and others. Textbooks containing several aspects of algebraic formal power series 
are [37] and [26]. The reader is also referred to the survey articles [25] and [35] . 

Let $A^*$ be the free monoid over $A$ and let $\varepsilon$ denote the empty word. A formal power
series is a function $S : A^* \to \mathbb{K}$. Given two formal power
series $S_1, S_2$, their Cauchy product, denoted $S_1 \cdot S_2$ or $S_1 S_2$, is given by

\[ (S_1 \cdot S_2, w) = \sum_{w_1 w_2 = w} (S_1, w_1) \cdot (S_2, w_2) \quad\text{for all } w \in A^*. \]

By $S_1 \odot S_2$ we denote the pointwise product, also
called the Hadamard product, and by $S_1 + S_2$ their pointwise sum. Moreover, if $k \in \mathbb{K}$, then
the formal power series $k.S$ is given by $(k.S, w) = k \cdot (S, w)$ for all $w \in A^*$. Let $1_L$ denote the
characteristic series of a language $L \subseteq A^*$. We identify $w$ and $1_{\{w\}}$. Let $\mathcal{X}$ be an alphabet of
variables such that $A \cap \mathcal{X} = \emptyset$. A polynomial $P$ over $(A \cup \mathcal{X})$ is a mapping $P : (A \cup \mathcal{X})^* \to \mathbb{K}$
such that its support is finite, i.e. the set $\mathrm{supp}(P) = \{w \in (A \cup \mathcal{X})^* \mid (P, w) \neq 0\}$ is finite.
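For series with finite support the two products can be computed directly. The following Python sketch is an illustration only: it assumes the semiring of natural numbers with the usual addition and multiplication, represents a polynomial as a dictionary from words to non-zero coefficients, and the function names `cauchy` and `hadamard` are hypothetical.

```python
from collections import defaultdict


def cauchy(s1, s2):
    """Cauchy product: (s1 . s2, w) sums (s1, w1) * (s2, w2) over all w1 w2 = w."""
    out = defaultdict(int)
    for w1, c1 in s1.items():
        for w2, c2 in s2.items():
            out[w1 + w2] += c1 * c2
    return dict(out)


def hadamard(s1, s2):
    """Hadamard product: coefficients are multiplied pointwise."""
    return {w: s1[w] * s2[w] for w in s1.keys() & s2.keys()}


s = {"a": 1, "aa": 1}
print(cauchy(s, s))  # {'aa': 1, 'aaa': 2, 'aaaa': 1}
```

The coefficient 2 of `aaa` reflects the two factorizations `a·aa` and `aa·a`, which is exactly the sum over $w_1 w_2 = w$ in the definition.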

Definition 6.1. A collection of polynomials $(P_X)_{X \in \mathcal{X}}$ over $(A \cup \mathcal{X})$ is called an algebraic
system with variables in $\mathcal{X}$.

The supports of the polynomials $P_X$ in the last definition are thus finite sets consisting
of words of the form $u_1 X_1 \ldots u_k X_k u_{k+1}$ where $u_j \in A^*$ and $X_j \in \mathcal{X}$. We say that a collection
$(S_X)_{X \in \mathcal{X}}$ of formal power series $S_X : A^* \to \mathbb{K}$ is a solution of the algebraic system $(P_X)_{X \in \mathcal{X}}$
if for all $X \in \mathcal{X}$,

\[ S_X = \sum_{u_1 X_1 \ldots u_k X_k u_{k+1} \in \mathrm{supp}(P_X)} (P_X, u_1 X_1 \ldots u_k X_k u_{k+1}).\, u_1 S_{X_1} \cdots u_k S_{X_k} u_{k+1}. \]

An algebraic system $(P_X)_{X \in \mathcal{X}}$ is proper if $(P_X, Y) = (P_X, \varepsilon) = 0$ for all $X, Y \in \mathcal{X}$. A
formal power series $S$ having the property that $(S, \varepsilon) = 0$ is called quasiregular. A proper
algebraic system has a unique quasiregular solution [37], more precisely a proper algebraic
system has exactly one solution $(S_X)_{X \in \mathcal{X}}$ such that $(S_X, \varepsilon) = 0$ for all $X \in \mathcal{X}$.
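The coefficients of the quasiregular solution of a proper system can be approximated by iterating the system starting from the zero series. The Python sketch below is an illustration and not taken from the text: it assumes the semiring of natural numbers, single-character variable names, polynomials given as dictionaries from monomials to coefficients, and hypothetical function names. Since every monomial of a proper system either is a terminal word or forces each substituted factor to be strictly shorter than the resulting word, the coefficients of all words of length at most `max_len` are stable after `max_len` rounds.

```python
from collections import defaultdict


def substitute(monomial, approx, max_len):
    """Expand one monomial, replacing each variable by its current approximation.

    Partial products longer than max_len are dropped, because they cannot
    contribute to words of length at most max_len.
    """
    partial = {"": 1}
    for symbol in monomial:
        factor = approx[symbol] if symbol in approx else {symbol: 1}
        nxt = defaultdict(int)
        for u, cu in partial.items():
            for v, cv in factor.items():
                if len(u) + len(v) <= max_len:
                    nxt[u + v] += cu * cv
        partial = dict(nxt)
    return partial


def solve(system, max_len):
    """Coefficients of the quasiregular solution on words of length <= max_len.

    system: dict variable -> polynomial (dict monomial -> coefficient);
    the system is assumed to be proper.
    """
    approx = {X: {} for X in system}          # start from the zero series
    for _ in range(max_len):
        new = {}
        for X, poly in system.items():
            total = defaultdict(int)
            for monomial, coeff in poly.items():
                for w, c in substitute(monomial, approx, max_len).items():
                    total[w] += coeff * c
            new[X] = dict(total)
        approx = new
    return approx


# X = XX + a: the coefficient of a^n counts binary bracketings (Catalan numbers).
print(solve({"X": {"XX": 1, "a": 1}}, 4)["X"])
# {'aa': 1, 'aaa': 2, 'aaaa': 5, 'a': 1}
```

Over the Boolean semiring the same iteration would compute membership in the corresponding context-free language up to the chosen length bound.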

Definition 6.2. A formal power series $S : A^* \to \mathbb{K}$ is an algebraic formal power series if it
is a component of the quasiregular solution of a proper algebraic system.

This definition is given in [37]. In [25, 26] a series $S$ is called algebraic if its quasiregular part $1_{A^+} \odot S$
is the component of the quasiregular solution of a proper algebraic system.

We note that over the 2-valued Boolean algebra $\mathbb{B}$ these series correspond exactly to
the $\varepsilon$-free context-free languages. The bijection is given by supp.

To warm up let us discuss some easy manipulations of algebraic systems. For this, let
us consider some algebraic system $(P_X)_{X \in \mathcal{X}}$. Let $X, Y \in \mathcal{X}$. Clearly, it follows directly
from the definition of a solution that we can substitute an occurrence of $Y$ in some word
of the support of $P_X$ by $P_Y$ without altering the solutions of the system. More formally:
Let $uYv \in \mathrm{supp}(P_X)$. Let $(P'_X)_{X \in \mathcal{X}}$ be given from $(P_X)_{X \in \mathcal{X}}$ by replacing $P_X$ with the
polynomial

\[ (1_{\mathrm{supp}(P_X) \setminus \{uYv\}} \odot P_X) + (P_X, uYv).\, u P_Y v. \]

Then $(P_X)_{X \in \mathcal{X}}$ and $(P'_X)_{X \in \mathcal{X}}$ are equivalent, i.e. any solution of $(P_X)_{X \in \mathcal{X}}$ is a
solution of $(P'_X)_{X \in \mathcal{X}}$ and vice versa. An algebraic system $(P_X)_{X \in \mathcal{X}}$ is called weakly strict
if $\mathrm{supp}(P_X) \subseteq \{\varepsilon\} \cup A(A \cup \mathcal{X})^*$ for all $X \in \mathcal{X}$. Let us now assume that $(P_X)_{X \in \mathcal{X}}$ is weakly
strict. Then for any fixed $k \in \mathbb{N}$ by repeated substitution we can obtain an equivalent
algebraic system $(P^k_X)_{X \in \mathcal{X}}$ such that for all $X \in \mathcal{X}$ any $w \in \mathrm{supp}(P^k_X) \setminus A^*$ contains at
least $k$ letters from $A$. We conclude that any weakly strict algebraic system $(P_X)_{X \in \mathcal{X}}$ has
a unique solution $(S_X)_{X \in \mathcal{X}}$ which is given by $(S_X, w) = (P^k_X, w)$ for all $w \in A^*$ such that
$|w| < k$.

Now, we continue by manipulating $(P^k_X)_{X \in \mathcal{X}}$. Let again $X \in \mathcal{X}$ and let $w \in \mathrm{supp}(P^k_X)$
with $|w| < k$. Let $Y \in \mathcal{X} \setminus \{X\}$. For any possible choice of occurrences of $X$ in the support
of $P^k_Y$ we substitute these occurrences by $w$. More precisely, for all $Y \in \mathcal{X} \setminus \{X\}$ replace
$P^k_Y$ by the polynomial

\[ \sum_{u_1, u_2, \ldots, u_i, u_{i+1} \in (A \cup \mathcal{X})^*} (P^k_Y, u_1 X u_2 \cdots u_i X u_{i+1}).\, u_1 \cdot (P^k_X, w).w \cdot u_2 \cdots u_i \cdot (P^k_X, w).w \cdot u_{i+1}. \]

Furthermore, replace $P^k_X$ by the polynomial

\[ 1_{(A \cup \mathcal{X})^* \setminus \{w\}} \odot \sum_{u_1, u_2, \ldots, u_i, u_{i+1} \in (A \cup \mathcal{X})^*} (P^k_X, u_1 X u_2 \cdots u_i X u_{i+1}).\, u_1 \cdot (P^k_X, w).w \cdot u_2 \cdots u_i \cdot (P^k_X, w).w \cdot u_{i+1}. \]

Observe that these sums are in fact finite and note that in these definitions the factors
$u_1, u_2, \ldots, u_i, u_{i+1} \in (A \cup \mathcal{X})^*$ may contain occurrences of $X$. The resulting system is again
weakly strict and has thus a unique solution $(S'_X)_{X \in \mathcal{X}}$. A straightforward but cumbersome
calculation, which we omit here, shows, using the distributivity of the semiring of formal
power series, that $S'_Y = S_Y$ for all $Y \in \mathcal{X} \setminus \{X\}$ and $S'_X = 1_{(A \cup \mathcal{X})^* \setminus \{w\}} \odot S_X$. For
fixed $0 \le k' < k$ by repeated application we can thus obtain a proper and weakly strict
algebraic system $(R_X)_{X \in \mathcal{X}}$ such that the quasiregular and unique solution $(T_X)_{X \in \mathcal{X}}$ is given
by $(1_{\{w \mid k' < |w|\}} \odot S_X)_{X \in \mathcal{X}}$. In particular, it follows that the quasiregular part $1_{A^+} \odot S_X$
of $S_X$ is algebraic for any $X \in \mathcal{X}$.

6.1. Nested Word Series and Their Projections. Next, we consider the projections
of regular nested word series and show that they give rise exactly to the algebraic series.
The projection $\pi(nw)$ of a nested word $nw = (w, \nu) \in \mathrm{NW}(A)$ is simply the word $w$,
i.e. we forget the nesting relation. This projection is canonically generalized to languages
$L \subseteq \mathrm{NW}(A)$ by setting $\pi(L) = \{\pi(nw) \mid nw \in L\}$ and to series $S : \mathrm{NW}(A) \to \mathbb{K}$ by letting

\[ \pi(S) : A^* \to \mathbb{K}, \qquad w \mapsto \sum_{\substack{nw \in \mathrm{NW}(A) \\ w = \pi(nw)}} (S, nw). \]
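For a series with finite support the projection simply adds up the weights of all nested words over the same underlying word. A minimal Python sketch, assuming natural-number weights and an encoding of nested words as pairs of a word and a frozen set of arches (both the encoding and the name `project` are illustrative assumptions):

```python
from collections import defaultdict


def project(series):
    """pi(S): sum the weights of all nested words sharing the same word."""
    out = defaultdict(int)
    for (word, _nu), weight in series.items():
        out[word] += weight   # the nesting relation is forgotten
    return dict(out)


S = {
    ("ab", frozenset()): 2,              # both positions internal
    ("ab", frozenset({(1, 2)})): 3,      # position 1 calls, position 2 returns
    ("a", frozenset()): 1,
}
print(project(S))  # {'ab': 5, 'a': 1}
```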

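Proposition 6.3. Let $S : \mathrm{NW}(A) \to \mathbb{K}$ be regular. Then $\pi(S) : A^* \to \mathbb{K}$ is an algebraic
formal power series.

Proof. Let $\mathcal{A} = (Q, \iota, \delta, \kappa)$ be a WNWA such that $\|\mathcal{A}\| = S$. We define a weakly strict
algebraic system $(P_{(q_1, q_2)})_{q_1, q_2 \in Q}$ with variables in $Q^2$ such that for its solution $(S_{(q_1, q_2)})_{q_1, q_2 \in Q}$
we have for all $w \in A^*$ with $|w| \ge 1$:

\[ (S_{(q_1, q_2)}, w) = \sum_{\substack{nw \in \mathrm{NW}(A) \\ \pi(nw) = w}}\; \sum_{r : q_1 \xrightarrow{nw} q_2} \mathrm{wgt}_{\mathcal{A}}(r). \tag{6.1} \]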

The idea is to simulate the transitions of a weighted nested word automaton. For this
we partition the set of nested words of length at least two into three classes.
The first class consists of the nested words where the first and the last position are either
corresponding call and return positions or both internal positions. The second class consists of
nested words where either the first position is a call position and the last position is an internal
position or the last position is a return position and the first position is an internal position.
The last class consists of all other nested words, i.e. those where the first position is a call
position and the last position is a return position which do not correspond to each other. Using this
partition we define for all $q_1, q_2 \in Q$ the polynomial $P_{(q_1,q_2)} : (A \cup Q^2)^* \to \mathbb{K}$ as follows:

\[
(P_{(q_1,q_2)}, w) =
\begin{cases}
1 & \text{if } q_1 = q_2 \text{ and } w = \varepsilon \\[2pt]
\delta_{int}(q_1, a, q_2) & \text{if } w = a \text{ for some } a \in A \\[2pt]
\delta_{int}(q_1, a, q_3) \cdot \delta_{int}(q_4, b, q_2) \; + \\
\quad \delta_{call}(q_1, a, q_3) \cdot \delta_{ret}(q_4, q_1, b, q_2)
  & \text{if } w = a (q_3,q_4) b \text{ for some } a, b \in A,\ q_3, q_4 \in Q \\[2pt]
\delta_{call}(q_1, a, q_3) \cdot \delta_{ret}(q_4, q_1, b, q_5) \cdot \delta_{int}(q_6, c, q_2) \; + \\
\quad \delta_{int}(q_1, a, q_3) \cdot \delta_{call}(q_4, b, q_5) \cdot \delta_{ret}(q_6, q_4, c, q_2)
  & \text{if } w = a (q_3,q_4) b (q_5,q_6) c \text{ for some } a, b, c \in A \\
  & \text{and } q_3, q_4, q_5, q_6 \in Q \\[2pt]
\delta_{call}(q_1, a, q_3) \cdot \delta_{ret}(q_4, q_1, b, q_5) \; \cdot \\
\quad \delta_{call}(q_6, c, q_7) \cdot \delta_{ret}(q_8, q_6, d, q_2)
  & \text{if } w = a (q_3,q_4) b (q_5,q_6) c (q_7,q_8) d \text{ for some } a, b, c, d \in A \\
  & \text{and } q_3, q_4, q_5, q_6, q_7, q_8 \in Q \\[2pt]
0 & \text{otherwise.}
\end{cases}
\]

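This is a weakly strict algebraic system having a necessarily unique solution $(S_{(q_1,q_2)})_{q_1,q_2 \in Q}$.
We show by induction on the length of $w$ that (6.1) holds. For $|w| = 1$ this is easy to see.
Now let $|w| > 1$ and write $w = a_1 \cdots a_n$. Then

\begin{align*}
(S_{(q_1,q_2)}, w)
&= \sum_{q_3,q_4 \in Q} \bigl[ \delta_{int}(q_1,a_1,q_3) \cdot \delta_{int}(q_4,a_n,q_2)
   + \delta_{call}(q_1,a_1,q_3) \cdot \delta_{ret}(q_4,q_1,a_n,q_2) \bigr]
   \cdot (S_{(q_3,q_4)}, a_2 \cdots a_{n-1}) \\
&\quad + \sum_{2 \le i \le n-1} \; \sum_{q_3,q_4,q_5,q_6 \in Q}
   \bigl[ \delta_{call}(q_1,a_1,q_3) \cdot \delta_{ret}(q_4,q_1,a_i,q_5) \cdot \delta_{int}(q_6,a_n,q_2) \\
&\qquad\qquad + \delta_{int}(q_1,a_1,q_3) \cdot \delta_{call}(q_4,a_i,q_5) \cdot \delta_{ret}(q_6,q_4,a_n,q_2) \bigr]
   \cdot (S_{(q_3,q_4)}, a_2 \cdots a_{i-1}) \cdot (S_{(q_5,q_6)}, a_{i+1} \cdots a_{n-1}) \\
&\quad + \sum_{2 \le i < j \le n-1} \; \sum_{q_3,\ldots,q_8 \in Q}
   \delta_{call}(q_1,a_1,q_3) \cdot \delta_{ret}(q_4,q_1,a_i,q_5) \cdot
   \delta_{call}(q_6,a_j,q_7) \cdot \delta_{ret}(q_8,q_6,a_n,q_2) \\
&\qquad\qquad \cdot (S_{(q_3,q_4)}, a_2 \cdots a_{i-1}) \cdot (S_{(q_5,q_6)}, a_{i+1} \cdots a_{j-1})
   \cdot (S_{(q_7,q_8)}, a_{j+1} \cdots a_{n-1}) \\[4pt]
&= \sum_{q_3,q_4 \in Q} \Bigl[ \delta_{int}(q_1,a_1,q_3) \cdot
   \sum_{\substack{nw \in NW(A) \\ \pi(nw) = a_2 \cdots a_{n-1}}} \; \sum_{r : q_3 \xrightarrow{nw} q_4} wgt_{\mathcal{A}}(r)
   \cdot \delta_{int}(q_4,a_n,q_2) \\
&\qquad\qquad + \delta_{call}(q_1,a_1,q_3) \cdot
   \sum_{\substack{nw \in NW(A) \\ \pi(nw) = a_2 \cdots a_{n-1}}} \; \sum_{r : q_3 \xrightarrow{nw} q_4} wgt_{\mathcal{A}}(r)
   \cdot \delta_{ret}(q_4,q_1,a_n,q_2) \Bigr] \\
&\quad + \sum_{2 \le i \le n-1} \; \sum_{q_3,q_4,q_5,q_6 \in Q} \Bigl[ \delta_{call}(q_1,a_1,q_3) \cdot
   \sum_{\substack{nw_1 \in NW(A) \\ \pi(nw_1) = a_2 \cdots a_{i-1}}} \; \sum_{r_1 : q_3 \xrightarrow{nw_1} q_4} wgt_{\mathcal{A}}(r_1)
   \cdot \delta_{ret}(q_4,q_1,a_i,q_5) \\
&\qquad\qquad\qquad \cdot
   \sum_{\substack{nw_2 \in NW(A) \\ \pi(nw_2) = a_{i+1} \cdots a_{n-1}}} \; \sum_{r_2 : q_5 \xrightarrow{nw_2} q_6} wgt_{\mathcal{A}}(r_2)
   \cdot \delta_{int}(q_6,a_n,q_2) \\
&\qquad\qquad + \delta_{int}(q_1,a_1,q_3) \cdot
   \sum_{\substack{nw_1 \in NW(A) \\ \pi(nw_1) = a_2 \cdots a_{i-1}}} \; \sum_{r_1 : q_3 \xrightarrow{nw_1} q_4} wgt_{\mathcal{A}}(r_1)
   \cdot \delta_{call}(q_4,a_i,q_5) \\
&\qquad\qquad\qquad \cdot
   \sum_{\substack{nw_2 \in NW(A) \\ \pi(nw_2) = a_{i+1} \cdots a_{n-1}}} \; \sum_{r_2 : q_5 \xrightarrow{nw_2} q_6} wgt_{\mathcal{A}}(r_2)
   \cdot \delta_{ret}(q_6,q_4,a_n,q_2) \Bigr] \\
&\quad + \sum_{2 \le i < j \le n-1} \; \sum_{q_3,\ldots,q_8 \in Q} \delta_{call}(q_1,a_1,q_3) \cdot
   \sum_{\substack{nw_1 \in NW(A) \\ \pi(nw_1) = a_2 \cdots a_{i-1}}} \; \sum_{r_1 : q_3 \xrightarrow{nw_1} q_4} wgt_{\mathcal{A}}(r_1)
   \cdot \delta_{ret}(q_4,q_1,a_i,q_5) \\
&\qquad\qquad \cdot
   \sum_{\substack{nw_2 \in NW(A) \\ \pi(nw_2) = a_{i+1} \cdots a_{j-1}}} \; \sum_{r_2 : q_5 \xrightarrow{nw_2} q_6} wgt_{\mathcal{A}}(r_2)
   \cdot \delta_{call}(q_6,a_j,q_7) \\
&\qquad\qquad \cdot
   \sum_{\substack{nw_3 \in NW(A) \\ \pi(nw_3) = a_{j+1} \cdots a_{n-1}}} \; \sum_{r_3 : q_7 \xrightarrow{nw_3} q_8} wgt_{\mathcal{A}}(r_3)
   \cdot \delta_{ret}(q_8,q_6,a_n,q_2) \\[4pt]
&= \sum_{\substack{nw \in NW(A) \\ \pi(nw) = w}} \; \sum_{r : q_1 \xrightarrow{nw} q_2} wgt_{\mathcal{A}}(r).
\end{align*}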
Now, let $X$ be a fresh variable and extend $(P_{(q_1,q_2)})_{q_1,q_2 \in Q}$ by adding the new polynomial
$P_X = \sum_{q_1,q_2 \in Q} \iota(q_1) \cdot \kappa(q_2).P_{(q_1,q_2)}$. Clearly, the unique solution of this extended system is
obtained by adding $S_X = \sum_{q_1,q_2 \in Q} \iota(q_1) \cdot \kappa(q_2).S_{(q_1,q_2)}$ to $(S_{(q_1,q_2)})_{q_1,q_2 \in Q}$. The quasiregular
part of $S_X$ equals $\pi(\|\mathcal{A}\|)$ which is thus algebraic by our considerations after Definition 6.2.

□

Given an algebraic system $(P_X)_{X \in \mathcal{X}}$ over $(A \cup \mathcal{X})$ and some $X \in \mathcal{X}$, we define the
underlying grammar $G_X = (A, \mathcal{X}, X, F)$ where the set $F \subseteq \mathcal{X} \times (\mathcal{X} \cup A)^*$ of productions is
given by letting $(Y, w) \in F$ iff $(P_Y, w) \neq 0$. Let $u \in A^*$. A derivation tree of $u$ under $G_X$
is a finite tree $t$ such that the following holds:

(a) The root is labeled with $(X, w)$ for some $w \in supp(P_X)$.

(b) For each inner node $v$ with label $(Y, w)$ the first components of the labels of the children
of $v$ from left to right yield $w$.

(c) The labels of the leaves from left to right yield $u$.

We collect all derivation trees $t$ of $u$ under $G_X$ in $Der(G_X, u)$. Clearly, if $(P_X)_{X \in \mathcal{X}}$ is
proper, then each inner node of $t$ either has a single leaf attached or branches at least
binarily. Hence, in this case $Der(G_X, u)$ is a finite set. Let $v$ be a node of $t$. If $v$ is an inner
node and $(Y, w)$ its label, then we let $wgt(t, v) = (P_Y, w)$. If $v$ is a leaf, we let $wgt(t, v) = 1$.
Now we define the weight $wgt(t)$ of $t$ by $wgt(t) = \prod_{v \text{ node of } t} wgt(t, v)$. The following lemma
seems to belong to what is sometimes called folklore; it can easily be shown by induction on
the length of $w$. A proof of a similar but weaker result can be found in [37, Theorem IV.1.5].

Lemma 6.4. Let $(P_X)_{X \in \mathcal{X}}$ be a proper algebraic system and let $(S_X)_{X \in \mathcal{X}}$ be its unique
quasiregular solution. Then

\[
(S_X, w) = \sum_{t \in Der(G_X, w)} wgt(t) \qquad \text{for all } X \in \mathcal{X} \text{ and } w \in A^*.
\]
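The statement of Lemma 6.4 can be checked numerically on a toy instance. The following is a minimal sketch, assuming the semiring of natural numbers and a hypothetical proper system with a single variable, $P_X = 2.ab + 3.aXb + 1.XX$; the system and all function names are illustrative choices, not taken from the paper. One function sums the weights of all derivation trees, the other approximates the solution by iterating the system, truncated at a maximal word length.

```python
from functools import lru_cache
from itertools import product

# Hypothetical proper algebraic system over the natural numbers:
#   P_X = 2.ab + 3.aXb + 1.XX
SYSTEM = {"X": {("a", "b"): 2, ("a", "X", "b"): 3, ("X", "X"): 1}}
VARS = set(SYSTEM)


def splits(u, k):
    """All ways to cut the string u into k consecutive non-empty pieces."""
    if k == 0:
        if u == "":
            yield ()
        return
    for i in range(1, len(u) + 1):
        for rest in splits(u[i:], k - 1):
            yield (u[:i],) + rest


@lru_cache(maxsize=None)
def tree_weight_sum(var, u):
    """Sum of wgt(t) over all derivation trees of u whose root variable is var."""
    total = 0
    for rhs, coeff in SYSTEM[var].items():
        for pieces in splits(u, len(rhs)):
            weight = coeff
            for sym, piece in zip(rhs, pieces):
                if sym in VARS:
                    # Subtrees below a variable child; pieces are strictly
                    # shorter than u because the system is proper.
                    weight *= tree_weight_sum(sym, piece)
                elif piece != sym:
                    weight = 0  # terminal child does not match the letter
            total += weight
    return total


def kleene(max_len):
    """Iterate the system starting from 0, keeping words of length <= max_len."""
    sol = {v: {} for v in SYSTEM}
    for _ in range(max_len + 1):
        new = {}
        for v, poly in SYSTEM.items():
            acc = {}
            for rhs, coeff in poly.items():
                partial = {"": coeff}
                for sym in rhs:
                    factor = sol[sym] if sym in VARS else {sym: 1}
                    nxt = {}
                    for w1, c1 in partial.items():
                        for w2, c2 in factor.items():
                            if len(w1) + len(w2) <= max_len:
                                nxt[w1 + w2] = nxt.get(w1 + w2, 0) + c1 * c2
                    partial = nxt
                for w, c in partial.items():
                    acc[w] = acc.get(w, 0) + c
            new[v] = acc
        sol = new
    return sol


approx = kleene(6)
agree = all(
    tree_weight_sum("X", "".join(t)) == approx["X"].get("".join(t), 0)
    for n in range(7)
    for t in product("ab", repeat=n)
)
print(agree)
```

In this toy system the word abab has exactly one derivation tree (root XX with two children ab), so its coefficient is 1 · 2 · 2 = 4.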

We now show the converse of Proposition 6.3.

Proposition 6.5. Let $R : A^* \to \mathbb{K}$ be an algebraic formal power series. Then there is a
regular nested word series $S : NW(A) \to \mathbb{K}$ such that $\pi(S) = R$.

Proof. Let $(P_X)_{X \in \mathcal{X}}$ be a proper algebraic system with quasiregular solution $(S_X)_{X \in \mathcal{X}}$
and let $Y \in \mathcal{X}$ such that $R = S_Y$. We construct a WNWA $\mathcal{A} = (Q, \iota, \delta, \kappa)$ such that
$\pi(\|\mathcal{A}\|) = S_Y$. Any element in the support of some $P_X$ will define a transition in the
automaton. In order not to produce $\varepsilon$-transitions, we require that each word in the support
of some $P_X$ contains an element of $A$, and in order to produce at most one call for each
transition, each word in the support of some $P_X$ contains at most two elements of $\mathcal{X}$.
Therefore we assume the algebraic system $(P_X)_{X \in \mathcal{X}}$ to be in Greibach normal form [26], i.e.
we require that $supp(P_X) \subseteq A \cup A\mathcal{X} \cup A\mathcal{X}\mathcal{X}$ for all $X \in \mathcal{X}$. Elements of $A\mathcal{X}\mathcal{X}$ produce call
transitions, elements in $A\mathcal{X}$ produce internal transitions and elements in $A$ produce return
transitions. More precisely, let $Q = (\mathcal{X} \cup \{\bot\}) \times (\mathcal{X} \cup \{\bot\})$ for some fresh symbol $\bot$, and
for all $X_1, X_3, X_4 \in \mathcal{X}$ and $X_2 \in \mathcal{X} \cup \{\bot\}$ let

\begin{align*}
\delta_{call}((X_1,X_4), a, (X_3,X_2)) &= (P_{X_1}, aX_3X_4) \\
\delta_{int}((X_1,X_2), a, (X_3,X_2)) &= (P_{X_1}, aX_3) \\
\delta_{ret}((X_1,X_2), (X_3,X_4), a, (X_4,X_2)) &= (P_{X_1}, a).
\end{align*}

Moreover, let $\delta_{int}((X_1,\bot), a, (\bot,\bot)) = (P_{X_1}, a)$. Any other transition gets weight 0.
Furthermore, for all $X, Z \in \mathcal{X} \cup \{\bot\}$ we let

\[
\iota(X,Z) = \begin{cases} 1 & \text{if } X = Y \\ 0 & \text{otherwise} \end{cases}
\qquad
\kappa(X,Z) = \begin{cases} 1 & \text{if } X = Z = \bot \\ 0 & \text{otherwise.} \end{cases}
\]

The idea is to simulate a derivation tree of the underlying grammar $G_Y$ traversed
from the left to the right. More precisely, when processing a production $(X_1, aX_3X_2)$ in
a derivation tree, then a call transition is executed and we continue in a state with first
component $X_3$. At the return position the automaton changes to $X_2$. Since the automaton
looks back to the state in which the automaton was before the corresponding call position, it
has to guess $X_2$ in advance. This is stored in the second component which was introduced
for this reason. One can show by induction on $|w|$ that for all $w \in A^*$, $X \in \mathcal{X}$ and
$Z' \in \mathcal{X} \cup \{\bot\}$ we have

\[
(S_X, wa) = \sum_{\nu \in Nest_{|w|}} \; \sum_{\substack{X' \in \mathcal{X} \\ Z \in \mathcal{X} \cup \{\bot\}}} \;
\sum_{r : (X,Z) \xrightarrow{(w,\nu)} (X',Z')} wgt_{\mathcal{A}}(r) \cdot (P_{X'}, a) \tag{6.2}
\]

where we make the convention that there is a run $r : (X,Z) \xrightarrow{\varepsilon} (X',Z')$ iff $X = X'$ and
$Z = Z'$. Moreover, for this run we let $wgt_{\mathcal{A}}(r) = 1$. Now the result follows easily from the
observation that by the definition of $\delta$ the last transition of a run $r : (Y,Z) \xrightarrow{(w,\nu)} (\bot,\bot)$
with $wgt(r) \neq 0$ must be an internal transition. □

Subsequently we make use of the following well-known result [26]. We just indicate
how it can be obtained in this context using Propositions 6.3 and 6.5, but note that a more
elementary proof and more general results can be found in [26, Chapter 15].

Corollary 6.6 (Kuich & Salomaa [25, Lemma 15.2]). Let $S : A^* \to \mathbb{K}$ be an algebraic
formal power series. Then there is an algebraic system $(P_X)_{X \in \mathcal{X}}$ such that
$supp(P_X) \subseteq A \cup A(A \cup \mathcal{X})^*A$ for all $X \in \mathcal{X}$ and $S = S_X$ for some $X \in \mathcal{X}$.

Proof. By Proposition 6.5, $S$ is the projection of some regular nested word series
$R : NW(A) \to \mathbb{K}$. Now let $\mathcal{A}$ be a WNWA and $Q$ its set of states such that $\|\mathcal{A}\| = R$. Consider
the weakly strict algebraic system $(P_X, (P_{(q_1,q_2)})_{q_1,q_2 \in Q})$ of the proof of Proposition 6.3 and
its unique solution $(S_X, (S_{(q_1,q_2)})_{q_1,q_2 \in Q})$. Using the manipulations given after Definition 6.2
we can transform this system into a system of the required form having as a solution the
quasiregular part of $S_X$ which equals $S$. □



6.2. A Logical Characterization of Algebraic Formal Power Series. Our aim is to
give a logical characterization of algebraic formal power series in the spirit of Lautemann,
Schwentick and Therien [27]. They showed that the context-free languages are precisely
the languages which can be defined by second-order sentences over words of the form $\exists\nu.\varphi$
where $\varphi$ is a first-order formula and $\nu$ a binary predicate ranging over nesting relations¹.
We identify a word $a_1 \ldots a_n \in A^*$ with the structure $([n], <, \lambda)$, where $<$ is the canonical
order of $[n]$ and $\lambda : [n] \to A$ is given by $\lambda(i) = a_i$ for all $i \in [n]$. Let $\varphi$ be a weighted
second-order formula over words containing, apart from a single 2-ary relation variable $\nu$,
only 1-ary relation variables. In other words let $\varphi \in MSO(\mathbb{K}, A, <, \nu)$. Let $Free(\varphi) \subseteq \mathcal{V}$,
$w \in A^*$ and $\gamma$ a $(\mathcal{V}, w)$-assignment. We define the semantics
$[\![\exists\nu.\varphi]\!]^{nest} : A^* \to \mathbb{K}$ by letting

\[
([\![\exists\nu.\varphi]\!]^{nest}, (w, \gamma)) = \sum_{\nu \in Nest_{|w|}} ([\![\varphi]\!], ((w, \nu), \gamma)).
\]

¹ In [27] nesting relations were named matchings.

Using our characterization of nested word automata by means of weighted logics
(Theorem 3.6), we may reformulate Proposition 6.3 as follows:

Corollary 6.7. Let $\varphi \in sRMSO(\mathbb{K}, A, <, \nu)$ be a sentence. Then
$[\![\exists\nu.\varphi]\!]^{nest} : A^* \to \mathbb{K}$ is an algebraic formal power series.

Next we show a result which sharpens Proposition 6.5. For this we follow the proof of
Lautemann, Schwentick and Therien [27, Theorem 2.1] with small changes in the details.

Proposition 6.8. Let $S$ be an algebraic formal power series. Then there is a sentence
$\varphi \in sRFO(\mathbb{K}, A, <, \nu)$ such that $S = [\![\exists\nu.\varphi]\!]^{nest}$.

Proof. We use an idea of Lautemann, Schwentick and Therien [27] and adapt it to the
weighted setting. This requires more care in order not to count weights twice.

A normal form. By Corollary 6.6 we may assume that $S$ is the component of the
solution of an algebraic system with variables in $\mathcal{X}$ having all supports in $A \cup A(A \cup \mathcal{X})^*A$.
By the transformations discussed after Definition 6.2 we obtain from this a proper algebraic
system $(P'_X)_{X \in \mathcal{X}}$ with solution $(S_X)_{X \in \mathcal{X}}$ such that for all $X \in \mathcal{X}$, $supp(P'_X)$ does not
contain elements of $A \cup \{\varepsilon\}$ and $\mathbb{1}_{\{w \in A^* \mid |w| > 1\}} \odot S$ is a component of the solution. Clearly,
it suffices to show the proposition for the latter series instead of $S$.

Now we proceed as in [27] and transform the system $(P'_X)_{X \in \mathcal{X}}$ into an equivalent system
$(P_X)_{X \in \mathcal{X}}$. Let $w \in supp(P'_X)$ for some $X \in \mathcal{X}$. The image of $w$ under the homomorphism
which is the identity on $A$ and maps any $Y \in \mathcal{X}$ to the fresh symbol $|$ is called the pattern
$patt(w)$ of $w$. Let us now fix a strict linear order $<$ on $\mathcal{X}$. Similarly to [27], we proceed
along this linear order. Let $X$ be the current symbol. In order to obtain $P_X$ we substitute
iteratively some $Z \in \mathcal{X}$ in some $w \in supp(P'_X)$ by $P'_Z$ (cf. considerations after Definition 6.2)
until for all $Y \in \mathcal{X}$ with $Y < X$, $patt(w) \neq patt(w')$ for all $w' \in supp(P_Y) \setminus A^*$ and
$w \in supp(P_X) \setminus A^*$. This is possible since by our considerations after Definition 6.2 we can
ensure that all elements in $supp(P_X) \setminus A^*$ are longer than all elements in $supp(P_Y) \setminus A^*$ for
all $Y < X$. We finally obtain a proper algebraic system $(P_X)_{X \in \mathcal{X}}$ equivalent to $(P'_X)_{X \in \mathcal{X}}$
having the following properties:

(1) $supp(P_X) \subseteq A(A \cup \mathcal{X})^+A$ for all $X \in \mathcal{X}$.

(2) For all $X, Y \in \mathcal{X}$, if $patt(w) = patt(w')$ for some $w \in supp(P_X) \setminus A^*$ and
$w' \in supp(P_Y) \setminus A^*$, then $X = Y$.

Let us fix $Y \in \mathcal{X}$. We now proceed by giving a sentence $\varphi_Y \in sRFO(\mathbb{K})$ such that
$\pi([\![\varphi_Y]\!]) = S_Y$. This will conclude the proof.

Some macros. Let $G_Y$ be the underlying grammar (see the definition after the proof
of Proposition 6.3) and let $u \in A^*$. The basic idea now is to assign to each derivation
tree $t \in Der(G_Y, u)$ a nesting relation $\nu_t$ of width $|u|$. This is done by letting $(i,j) \in \nu_t$
if there is an inner node of $t$ such that the leaves of the subtree rooted at this node are
exactly the leaves between the $i$th and the $j$th leaf of $t$ (in lexicographic order including
the $i$th and the $j$th leaf). Clearly, due to the special form of $(P_X)_{X \in \mathcal{X}}$ this binary relation
is indeed a nesting relation. Let us now define some macros for nested words. Let
$nw = (u, \nu) = (a_1 \ldots a_k, \nu) \in NW(A)$. Then let $min(x)$ and $max(y)$ express that $x$ is assigned the
first position and $y$ the last position. Furthermore, the formula $inchild(x, y)$ expresses that
$(x, y) \in \nu$ corresponds to an inner node of $t$ which has an inner node as a child.

\[
inchild(x, y) = \nu(x, y) \wedge \exists z, z'. \, (x < z < y) \wedge \nu(z, z')
\]

The macro $surf(x, y, x_1, y_1)$ says that $(x_1, y_1)$ is a surface arch of $nw[x, y]$:

\[
surf(x, y, x_1, y_1) = (x < x_1 < y_1 < y) \wedge \nu(x_1, y_1) \wedge
\forall z, z'. \, (x < z < x_1 < y_1 < z' < y) \to \neg\nu(z, z').
\]

As in [27], for $v \in A^*$ let $\psi_v(x, y)$ be a first-order formula that expresses that there is no call
strictly between positions $x$ and $y$ and that the substring given by the positions strictly
between position $x$ and $y$ equals $v$. For a word $w = avb \in A^+$ define $\vartheta_w(x, y)$ as follows.

\[
\vartheta_w(x, y) = Lab_a(x) \wedge Lab_b(y) \wedge \psi_v(x, y).
\]

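Now we will need the notion of a pattern also for nested words [27]. Let $(i_1, j_1), \ldots, (i_s, j_s)$
be the sequence of all surface arches of $nw$. The pattern $patt(nw)$ of $nw$ is the string
$a_1 \ldots a_{i_1 - 1} \mid a_{j_1 + 1} \ldots a_{i_s - 1} \mid a_{j_s + 1} \ldots a_k$. Now, let $X \in \mathcal{X}$, let
$w = a v_0 X_1 v_1 \ldots v_{s-1} X_s v_s b \in supp(P_X) \setminus A^+$ and let
$p = patt(w) = a v_0 \mid v_1 \ldots v_{s-1} \mid v_s b$. We define the formula $\chi_p(x)$
(cf. [27]) which states that $x$ is a call position with return position $y$ and $patt(nw[x, y]) = p$.

\begin{align*}
\chi_p(x) = \exists y. \; & \nu(x, y) \wedge Lab_a(x) \wedge Lab_b(y) \; \wedge \\
& \wedge \exists x_1, y_1, \ldots, x_s, y_s. \, (x < x_1 < y_1 \ldots < y_s < y) \wedge
  \psi_{v_0}(x, x_1) \wedge \ldots \wedge \psi_{v_s}(y_s, y) \; \wedge \\
& \wedge \bigl( surf(x, y, x_1, y_1) \wedge \ldots \wedge surf(x, y, x_s, y_s) \bigr)
\end{align*}

Now let $\chi_X(x)$ be the disjunction of all $\chi_p(x)$ over all patterns $p$ of words $w \in supp(P_X) \setminus A^+$
and let $\vartheta_X(x, y)$ be the disjunction of all $\vartheta_w(x, y)$ over $w \in supp(P_X) \cap A^+$. Let again
$w = a v_0 X_1 v_1 \ldots v_{s-1} X_s v_s b \in supp(P_X) \setminus A^+$. Similarly to [27] we define now the formula
$\chi_w(x, y)$:

\begin{align*}
\chi_w(x, y) = \exists y. \; & \nu(x, y) \wedge Lab_a(x) \wedge Lab_b(y) \; \wedge \\
& \wedge \exists x_1, y_1, \ldots, x_s, y_s. \, (x < x_1 < y_1 \ldots < y_s < y) \wedge
  \psi_{v_0}(x, x_1) \wedge \ldots \wedge \psi_{v_s}(y_s, y) \; \wedge \\
& \wedge \bigl( surf(x, y, x_1, y_1) \wedge \ldots \wedge surf(x, y, x_s, y_s) \bigr) \; \wedge \\
& \wedge \bigl( \chi_{X_1}(x_1) \vee \vartheta_{X_1}(x_1, y_1) \bigr) \wedge \ldots \wedge
  \bigl( \chi_{X_s}(x_s) \vee \vartheta_{X_s}(x_s, y_s) \bigr).
\end{align*}

We show in the next paragraph that there is a bijective correspondence between the set of
derivation trees $t \in Der(G_Y, u)$ and the nested words $(u, \nu)$ satisfying the following formula

\begin{align*}
\psi_Y = \exists x, y. \; & min(x) \wedge max(y) \wedge \nu(x, y) \wedge
  \bigl( \chi_Y(x, y) \vee \vartheta_Y(x, y) \bigr) \; \wedge \\
& \wedge \forall z, z'. \; inchild(z, z') \to
  \bigvee_{\substack{X \in \mathcal{X} \\ w \in supp(P_X) \setminus A^+}} \chi_w(z, z').
\end{align*}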

The formula. Given a derivation tree $t \in Der(G_Y, u)$ with $u = a_1 \ldots a_n$ we assign to it a nesting
relation $\nu_t$ as described above. Clearly, $(1, n) \in \nu_t$ and either $(u, \nu_t) \models \vartheta_Y[1, n]$ or
$(u, \nu_t) \models \chi_Y[1, n]$. Furthermore, if $1 \le i < j \le n$ and $(u, \nu_t) \models inchild[i, j]$, then there is
an inner node of $t$ such that the leaves of the subtree rooted at this node are exactly the leaves
between the $i$th and the $j$th leaf of $t$. Let $(X, w)$ be the label of this inner node, then
$(u, \nu_t) \models \chi_w[i, j]$ by construction and hence $(u, \nu_t) \models \psi_Y$. Conversely, let $\nu$ be a nesting
relation such that $(u, \nu) \models \psi_Y$. We define a derivation tree $t_\nu$ inductively as follows. If
$\{(1, n)\} = \nu$, then $t_\nu$ consists of a single inner node, the root, labeled by $(Y, u)$. In this case
we must have $(u, \nu) \models \vartheta_Y[1, n]$ and hence $t_\nu$ is a derivation tree. Otherwise, let
$(i_1, j_1), \ldots, (i_s, j_s)$ be the sequence of surface arches of $(u, \nu \setminus \{(1, n)\})$ and let
$a_1 \ldots a_{i_1 - 1} \mid \ldots \mid a_{j_s + 1} \ldots a_n$ be the pattern of $(u, \nu \setminus \{(1, n)\})$. Moreover, for
$1 \le k \le s$ let $u[i_k, j_k]$ be the subword of $u$ from the $i_k$th position to the $j_k$th position. Then
we must have

\[
(u, \nu) \models \bigvee_{\substack{X \in \mathcal{X} \\ w \in supp(P_X) \setminus A^+}} \chi_w[1, n]
\]

and hence for all $1 \le k \le s$ we have $(u, \nu)[i_k, j_k] \models \psi_{X_k}$ for some $X_k \in \mathcal{X}$. Thus by induction
hypothesis there are $t_k \in Der(G_{X_k}, u[i_k, j_k])$. We define $t_\nu$ to be the tree whose root
is labeled $(Y, a_1 \ldots a_{i_1 - 1} X_1 \ldots X_s a_{j_s + 1} \ldots a_n)$ and where the trees rooted at the children of
the root are as follows from left to right: $a_1, \ldots, a_{i_1 - 1}, t_1, \ldots, t_s, a_{j_s + 1}, \ldots, a_n$. We conclude
that $t_\nu$ is a derivation tree, since $(u, \nu) \models \chi_Y[1, n]$.
Now we can give the formula $\varphi_Y$.

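\begin{align*}
\varphi_Y = \psi_Y \wedge \forall x. \forall y. \; \nu(x, y) \xrightarrow{+}
\bigvee_{\substack{X \in \mathcal{X} \\ w \in supp(P_X)}} \Bigl(
  & \bigl( inchild(x, y) \xrightarrow{+} ( \chi_w(x, y)^+ \wedge (P_X, w) ) \bigr) \; \wedge \\
  & \wedge \bigl( \neg inchild(x, y) \xrightarrow{+} ( \vartheta_w(x, y)^+ \wedge (P_X, w) ) \bigr) \Bigr)
\end{align*}

Let $t \in Der(G_Y, u)$ and let $\nu_t$ be the corresponding nesting relation. By construction
$([\![\varphi_Y]\!], (u, \nu_t)) = wgt(t)$ and thus $[\![\exists\nu.\varphi_Y]\!]^{nest} = S_Y$ by Lemma 6.4. □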

Let us summarize our results of this section so far. 

Theorem 6.9. Let $\mathbb{K}$ be a commutative semiring and let $S : A^* \to \mathbb{K}$ be a formal power
series. Then the following are equivalent:

(1) $S$ is an algebraic formal power series.

(2) $S = \pi(R)$ for some regular $R : NW(A) \to \mathbb{K}$.

(3) There is a sentence $\varphi \in sRFO(\mathbb{K}, A, <, \nu)$ such that $[\![\exists\nu.\varphi]\!]^{nest} = S$.

Proof. (1) ⇒ (3). This is Proposition 6.8.

(3) ⇒ (2). Follows from Theorem 3.6(a) and the definition of $\pi$.

(2) ⇒ (1). This is Proposition 6.3. □

Let $\mathbb{K} = \mathbb{N}$ and let $S : \{a\}^+ \to \mathbb{N}$ be an algebraic series. As $S = \pi(R)$ for some regular
nested word series $R$, it is not hard to see that $(S, a^n) \le 2^n \cdot c^n$ for some constant $c$ and all
$n \in \mathbb{N}$. Using weighted pushdown automata (cf. [26]) one can even show that $(S, a^n) \le c^n$
for some constant $c$ and all $n \in \mathbb{N}$. Thus in item 3 of the last result we may not replace
$sRFO(\mathbb{K})$ by $FO(\mathbb{K})$ since $([\![\forall x.\exists y.1]\!], a^n) = n^n$.
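The growth argument can be replayed numerically. The sketch below assumes the usual weighted semantics over the natural numbers, in which an existential first-order quantifier is a sum over positions and a universal one is a product over positions; the function names are hypothetical.

```python
def forall_exists_one(n):
    """Value of the sentence (forall x. exists y. 1) on the word a^n over (N, +, *)."""
    exists_y = sum(1 for _ in range(n))  # existential quantifier: sum over n positions
    value = 1
    for _ in range(n):                   # universal quantifier: product over n positions
        value *= exists_y
    return value


def first_exceeding(c):
    """Smallest n >= 1 at which n^n exceeds the exponential bound c^n."""
    n = 1
    while forall_exists_one(n) <= c ** n:
        n += 1
    return n


print([forall_exists_one(n) for n in range(1, 5)])  # [1, 4, 27, 256]
print(first_exceeding(10))                          # 11
```

Whatever constant $c$ is chosen, the bound $c^n$ is violated from $n = c + 1$ onwards, which is why the unrestricted fragment leaves the class of algebraic series.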

Again we note that all proofs are effective: given a proper algebraic system $(P_X)_{X \in \mathcal{X}}$
with solution $(S_X)_{X \in \mathcal{X}}$ and an effectively given semiring $\mathbb{K}$, we can compute an $sRFO(\mathbb{K})$
sentence $\varphi_Y$ for all $Y \in \mathcal{X}$ such that $S_Y = [\![\exists\nu.\varphi_Y]\!]^{nest}$ and vice versa.
6.3. Yet Another Characterization of Algebraic Formal Power Series. Even though
our logical characterization of regular nested word series (Theorem 3.6) might also be
obtained by structural induction, the connection between alternating texts and nested words
we established enables us now to also obtain a generalization of the second main result
of [27]. In that paper another logical characterization of context-free languages was given
where quantification over nesting relations is replaced by quantification over
tree-definable orders. In [27] a linear order $<$ on $[n]$ was called tree-definable if there is a binary
tree $t$ with $n$ leaves which are labeled $1, \ldots, n$ in lexicographic order and whose internal
nodes are labeled with $\{\nearrow, \searrow\}$ such that $i < j$ iff $i$ is visited before $j$ in the depth-first
traversal of $t$ in which, at every node with label $\nearrow$, first the left, and at every node with
label $\searrow$, first the right child is visited. We will give a slightly different definition which is
easily seen to be equivalent by simply replacing $\nearrow$ by $\cdot$ and $\searrow$ by $\circ$.

Definition 6.10. Let $n \in \mathbb{N}$ and let $<_1$ be the canonical order of $[n]$. Moreover, let
$\lambda : [n] \to A$ be a labeling. A linear order $<_2$ of $[n]$ is tree-definable iff $([n], \lambda, <_1, <_2)$ is an
alternating text.
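For small $n$ the tree-definable orders can be enumerated directly from the tree-based description of [27] recalled above. The sketch below is only an illustration under that description: an order is represented by the sequence in which the leaves $1, \ldots, n$ are visited, and every internal node either visits its left subtree first or its right subtree first. The function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def visit_orders(lo, hi):
    """All visiting sequences of the leaves lo, ..., hi-1 induced by some
    binary tree whose internal nodes are labeled left-first or right-first."""
    if hi - lo == 1:
        return frozenset({(lo,)})
    result = set()
    for mid in range(lo + 1, hi):          # leaves lo..mid-1 go left, mid..hi-1 go right
        for left in visit_orders(lo, mid):
            for right in visit_orders(mid, hi):
                result.add(left + right)   # node that visits its left child first
                result.add(right + left)   # node that visits its right child first
    return frozenset(result)


def tree_definable_orders(n):
    """Visiting sequences of all tree-definable orders of {1, ..., n}."""
    return visit_orders(1, n + 1)


print([len(tree_definable_orders(n)) for n in range(1, 6)])  # [1, 2, 6, 22, 90]
```

Different trees may induce the same order, so the set of sequences is deduplicated. For $n = 4$ exactly two of the 24 linear orders are missing, namely the visiting sequences 2, 4, 1, 3 and 3, 1, 4, 2.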

We collect all tree-definable orders of $[n]$ in $TDO_n$. Our aim is now to extend the
above mentioned result of [27] and to show, using the connection between nested words and
alternating texts, that a formal power series is an algebraic formal power series iff it can be
defined by a second-order sentence over words of the form $\exists <_2 . \varphi$ where $\varphi$ is a first-order
formula and $<_2$ a binary relation symbol ranging over tree-definable orders. Note that like
matchings, tree-definable orders are first-order definable relations [22,30]. We start
by defining the projection $\pi(\tau)$ of an alternating text $\tau = ([n], <_1, <_2, \lambda) \in TXT(A)$ to be
the word $([n], <_1, \lambda)$, i.e. we forget the second order. As for nested words, this projection
is canonically generalized to languages $L \subseteq TXT(A)$ by setting $\pi(L) = \{\pi(\tau) \mid \tau \in L\}$ and
to series $S : TXT(A) \to \mathbb{K}$ by letting

\[
\pi(S) : A^* \to \mathbb{K}, \qquad
w \mapsto \sum_{\substack{\tau \in TXT(A) \\ \pi(\tau) = w}} (S, \tau).
\]

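Proposition 6.11. Let $S : TXT(A) \to \mathbb{K}$ be regular. Then $\pi(S) : A^* \to \mathbb{K}$ is an algebraic
formal power series.

Proof. Consider a WPA $\mathcal{A} = (\mathcal{H}, \mathcal{V}, \Omega, \mu, \mu_{op}, \mu_{cl}, \lambda, \gamma)$ such that $\|\mathcal{A}\| = S$. Let
$\mathcal{X} = (\mathcal{H}^2 \times \{0, 1\}) \cup (\mathcal{V}^2 \times \{0, 1\})$. We define an algebraic system $(P_X)_{X \in \mathcal{X}}$ as follows: For all
$h_1, h_2 \in \mathcal{H}$ and $v_1, v_2 \in \mathcal{V}$ we let

\begin{align*}
P_{(h_1,h_2,1)} &= \sum_{a \in A} \mu(h_1, a, h_2).a +
  \sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot \mu_{cl}(v', )_s, h_2).(v, v', 0) \\
P_{(h_1,h_2,0)} &= \sum_{h_3 \in \mathcal{H}} (h_1, h_3, 1)(h_3, h_2, 1) + (h_1, h_3, 1)(h_3, h_2, 0) \\
P_{(v_1,v_2,1)} &= \sum_{a \in A} \mu(v_1, a, v_2).a +
  \sum_{h, h' \in \mathcal{H}} \sum_{s \in \Omega} \mu_{op}(v_1, (_s, h) \cdot \mu_{cl}(h', )_s, v_2).(h, h', 0) \\
P_{(v_1,v_2,0)} &= \sum_{v_3 \in \mathcal{V}} (v_1, v_3, 1)(v_3, v_2, 1) + (v_1, v_3, 1)(v_3, v_2, 0).
\end{align*}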
We claim that this algebraic system has a unique quasiregular solution $(S_X)_{X \in \mathcal{X}}$ which
consists of algebraic formal power series. Indeed, if we replace the polynomial $P_{(h_1,h_2,1)}$ by
the polynomial

\[
\sum_{a \in A} \mu(h_1, a, h_2).a +
\sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot \mu_{cl}(v', )_s, h_2).P_{(v,v',0)}
\]

and the polynomial $P_{(v_1,v_2,1)}$ by the polynomial

\[
\sum_{a \in A} \mu(v_1, a, v_2).a +
\sum_{h, h' \in \mathcal{H}} \sum_{s \in \Omega} \mu_{op}(v_1, (_s, h) \cdot \mu_{cl}(h', )_s, v_2).P_{(h,h',0)},
\]

we obtain an equivalent system (cf. manipulations after Definition 6.2) which is proper and
has thus the unique quasiregular solution $(S_X)_{X \in \mathcal{X}}$ which consists of algebraic formal power
series. Let $TXT^{\circ} \subseteq TXT(A)$ be the set of all alternating texts which are either singletons
or $\circ$-products. Analogously let $TXT^{\cdot} \subseteq TXT(A)$ be the set of all alternating texts which
are either singletons or $\cdot$-products. We will show by induction that we have for all $w \in A^*$
with $|w| \ge 1$

\begin{align*}
(S_{(h_1,h_2,1)}, w) &= \sum_{\substack{\tau \in TXT^{\cdot} \\ \pi(\tau) = w}} \;
  \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r) \quad \text{and} \tag{6.3} \\
(S_{(v_1,v_2,1)}, w) &= \sum_{\substack{\tau \in TXT^{\circ} \\ \pi(\tau) = w}} \;
  \sum_{r : v_1 \xrightarrow{\tau} v_2} wgt_{\mathcal{A}}(r) \tag{6.4}
\end{align*}

as well as

\begin{align*}
(S_{(h_1,h_2,1)}, w) + (S_{(h_1,h_2,0)}, w) &= \sum_{\substack{\tau \in TXT(A) \\ \pi(\tau) = w}} \;
  \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r) \quad \text{and} \tag{6.5} \\
(S_{(v_1,v_2,1)}, w) + (S_{(v_1,v_2,0)}, w) &= \sum_{\substack{\tau \in TXT(A) \\ \pi(\tau) = w}} \;
  \sum_{r : v_1 \xrightarrow{\tau} v_2} wgt_{\mathcal{A}}(r). \tag{6.6}
\end{align*}

The result then follows immediately from the fact that algebraic formal power series are
closed under pointwise sum and scalar multiplication. Let $w = a$ for some $a \in A$. Since the
series $S_{(h_1,h_2,1)}$ and $S_{(v_1,v_2,1)}$ are quasiregular, we obtain that
$(S_{(v_1,v_2,0)}, a) = (S_{(h_1,h_2,0)}, a) = 0$. From this it is easy to deduce the induction base.
Let now $|w| > 1$. Then

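\begin{align*}
(S_{(h_1,h_2,1)}, w)
&= \sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot \mu_{cl}(v', )_s, h_2) \cdot (S_{(v,v',0)}, w) \\
&= \sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot
   \Bigl( \sum_{v_3 \in \mathcal{V}} \sum_{w = w_1 w_2} (S_{(v,v_3,1)}, w_1) \cdot (S_{(v_3,v',1)}, w_2)
   + (S_{(v,v_3,1)}, w_1) \cdot (S_{(v_3,v',0)}, w_2) \Bigr) \cdot \mu_{cl}(v', )_s, h_2) \\
&= \sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot
   \Bigl( \sum_{v_3 \in \mathcal{V}} \sum_{w = w_1 w_2} (S_{(v,v_3,1)}, w_1) \cdot
   \bigl( (S_{(v_3,v',1)}, w_2) + (S_{(v_3,v',0)}, w_2) \bigr) \Bigr) \cdot \mu_{cl}(v', )_s, h_2) \\
&= \sum_{v, v' \in \mathcal{V}} \sum_{s \in \Omega} \mu_{op}(h_1, (_s, v) \cdot
   \Bigl( \sum_{v_3 \in \mathcal{V}} \sum_{w = w_1 w_2} \;
   \sum_{\substack{\tau_1 \in TXT^{\circ} \\ \pi(\tau_1) = w_1}} \sum_{r_1 : v \xrightarrow{\tau_1} v_3} wgt_{\mathcal{A}}(r_1) \cdot
   \sum_{\substack{\tau_2 \in TXT(A) \\ \pi(\tau_2) = w_2}} \sum_{r_2 : v_3 \xrightarrow{\tau_2} v'} wgt_{\mathcal{A}}(r_2)
   \Bigr) \cdot \mu_{cl}(v', )_s, h_2).
\end{align*}

Since $TXT(A)$ is the free bisemigroup, given some $w$ of length at least two, each $\tau \in TXT^{\cdot}$
with $\pi(\tau) = w$ decomposes uniquely into $\tau = \tau_1 \cdot \tau_2$ with $\tau_1 \in TXT^{\circ}$ and $\tau_2 \in TXT(A)$.
Hence we can continue

\[
= \sum_{\substack{\tau \in TXT^{\cdot} \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r).
\]

Analogously we get Equation (6.4). Similarly we get:

\begin{align*}
(S_{(h_1,h_2,1)}, w) + (S_{(h_1,h_2,0)}, w)
&= \sum_{\substack{\tau \in TXT^{\cdot} \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r)
   + \sum_{h_3 \in \mathcal{H}} \sum_{w = w_1 w_2} (S_{(h_1,h_3,1)}, w_1) \cdot
   \bigl( (S_{(h_3,h_2,1)}, w_2) + (S_{(h_3,h_2,0)}, w_2) \bigr) \\
&= \sum_{\substack{\tau \in TXT^{\cdot} \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r)
   + \sum_{h_3 \in \mathcal{H}} \sum_{w = w_1 w_2} \;
   \sum_{\substack{\tau_1 \in TXT^{\cdot} \\ \pi(\tau_1) = w_1}} \sum_{r_1 : h_1 \xrightarrow{\tau_1} h_3} wgt_{\mathcal{A}}(r_1) \cdot
   \sum_{\substack{\tau_2 \in TXT(A) \\ \pi(\tau_2) = w_2}} \sum_{r_2 : h_3 \xrightarrow{\tau_2} h_2} wgt_{\mathcal{A}}(r_2) \\
&= \sum_{\substack{\tau \in TXT^{\cdot} \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r)
   + \sum_{\substack{\tau \in TXT^{\circ} \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r) \\
&= \sum_{\substack{\tau \in TXT(A) \\ \pi(\tau) = w}} \; \sum_{r : h_1 \xrightarrow{\tau} h_2} wgt_{\mathcal{A}}(r).
\end{align*}

Again Equation (6.6) can be shown analogously, which concludes the proof. □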

Now we get our second characterization of algebraic formal power series. For this, we
proceed as follows: Let $\varphi$ be a weighted second-order formula over words containing, apart
from a single 2-ary relation variable $<_2$, only 1-ary relation variables. In other words, let
$\varphi \in MSO(\mathbb{K}, A, <_1, <_2)$. Let $Free(\varphi) \subseteq \mathcal{V}$, $w = ([n], <_1, \lambda) \in A^*$ and $\gamma$ a $(\mathcal{V}, w)$-assignment.
We define the semantics $[\![\exists <_2 . \varphi]\!]^{tdo} : A^* \to \mathbb{K}$ by letting

\[
([\![\exists <_2 . \varphi]\!]^{tdo}, (w, \gamma)) = \sum_{<_2 \in TDO_n} ([\![\varphi]\!], (([n], <_1, <_2, \lambda), \gamma)).
\]

Theorem 6.12. Let $\mathbb{K}$ be a commutative semiring and let $S : A^* \to \mathbb{K}$ be a formal power
series. Then the following are equivalent:

(1) $S$ is an algebraic formal power series.

(2) $S = \pi(R)$ for some regular $R : TXT(A) \to \mathbb{K}$.

(3) There is a sentence $\varphi \in sRFO(\mathbb{K}, A, <_1, <_2)$ such that $[\![\exists <_2 . \varphi]\!]^{tdo} = S$.
Proof. (1) ⇒ (3). Let $S : A^* \to \mathbb{K}$ be an algebraic formal power series. By Theorem 6.9
there is an $sRFO(\mathbb{K})$ sentence $\varphi$ over nested words such that $[\![\exists\nu.\varphi]\!]^{nest} = S$. By Corollary 5.6
the partial function $\Phi_0^{-1}$ is FO-definable without parameter. Similar to Proposition 5.2 one
can show that there is thus an $sRFO(\mathbb{K})$ sentence $\varphi'$ over texts such that $\Phi_0([\![\varphi]\!]) = [\![\varphi']\!]$.
Now we can calculate using observations (a) and (b) before Lemma 5.3 as follows.

\begin{align*}
([\![\exists\nu.\varphi]\!]^{nest}, w)
&= \sum_{\substack{nw \in NW(A) \\ \pi(nw) = w}} ([\![\varphi]\!], nw)
 = \sum_{\substack{nw \in NW(A) \\ \pi(\Phi_0(nw)) = w}} ([\![\varphi]\!], nw) \\
&= \sum_{\substack{\tau \in \Phi_0(NW(A)) \\ \pi(\tau) = w}} ([\![\varphi]\!], \Phi_0^{-1}(\tau))
 = \sum_{\substack{\tau \in \Phi_0(NW(A)) \\ \pi(\tau) = w}} ([\![\varphi']\!], \tau)
 = ([\![\exists <_2 . \varphi']\!]^{tdo}, w).
\end{align*}

(3) ⇒ (2). Follows from Theorem 4.2 and the definition of $\pi$.

(2) ⇒ (1). This is Proposition 6.11. □



7. Concluding Remarks and Future Work 

We introduced a quantitative automaton model and a quantitative logic for nested 
words and showed that they are equally expressive. This generalizes the logical character- 
ization of the unweighted case as given in [3]. Moreover, we established a new connection 
between nested words and alternating texts. Applying the result, we obtained a charac- 
terization of algebraic formal power series in terms of weighted logics. Presumably, the 
logical characterization of regular nested word series could also be obtained by structural 
induction. However, the connection between alternating texts and nested words enabled 
us to also obtain a second characterization of algebraic formal power series. Note that 
even though the characterizations of algebraic formal power series are generalizations of 
the results of [27] to a weighted setting, in contrast to the latter paper we gave a different 
proof using this connection as well as (weighted) nested word automata and (weighted) 
parenthesizing automata. Also note that weighted nested word automata and weighted 
parenthesizing automata were characterized algebraically in [31].

Let us remark that regular formal power series also fall into the pattern of our
characterizations (Theorem 6.9 and Theorem 6.12) of algebraic formal power series. In fact,
Thomas showed that a single existential monadic second-order quantifier suffices to
characterize finite automata [39, Theorem 5.2]. That is, in the pattern of the last results we can
formulate that $L \subseteq A^*$ is regular iff $L = [\![\exists M.\varphi]\!]^{set}$ for some first-order formula $\varphi$ (where
$[\![\exists M.\varphi]\!]^{set}$ means that we sum over all subsets $M$ of the domain of a given structure). Let
us explain the idea of the proof with an example. Given an automaton $\mathcal{A}$ with set of states
$Q = \{0, 1\}^k$ for $L$ and some word $a_1 \ldots a_{2k} \in L$, the idea is to think of the interpretation of
$M$ as a word $u_1 u_2$ where $u_1, u_2 \in \{0, 1\}^k = Q$ and to express by $\varphi$ that $u_1$ is an initial state,
that there is a run from $u_1$ to $u_2$ on $a_1 \ldots a_k$ and that there is a run from $u_2$ into a final
state on $a_{k+1} \ldots a_{2k}$. Alternatively, one can prove the result similarly to Proposition 6.8
where one starts with a right-regular system and applies a similar transformation. Then a
set $M$ suffices to encode a derivation tree since any inner node has at most one non-terminal
child whose position is collected in $M$. In any case, it is not hard to see that the proof can
be adapted to a weighted setting. So, also in the weighted case we can restrict ourselves to
a single existential monadic second-order quantifier.
Following this pattern it might be interesting to further investigate whether other
important classes of formal power series can be characterized in this manner. Again, the
work of Lautemann, Schwentick and Therien [27] can be used as a starting point where the
so-called $k$-linear languages were considered.